Item Response Models for Forced-Choice Questionnaires: A Common Framework

Psychometrika

Abstract

In forced-choice questionnaires, respondents have to make choices between two or more items presented at the same time. Several IRT models have been developed to link respondent choices to underlying psychological attributes, including the recent MUPP (Stark et al. in Appl Psychol Meas 29:184–203, 2005) and Thurstonian IRT (Brown and Maydeu-Olivares in Educ Psychol Meas 71:460–502, 2011) models. In the present article, a common framework is proposed that describes forced-choice models along three axes: (1) the forced-choice format used; (2) the measurement model for the relationships between items and psychological attributes they measure; and (3) the decision model for choice behavior. Using the framework, fundamental properties of forced-choice measurement of individual differences are considered. It is shown that the scale origin for the attributes is generally identified in questionnaires using either unidimensional or multidimensional comparisons. Both dominance and ideal point models can be used to provide accurate forced-choice measurement; and the rules governing accurate person score estimation with these models are remarkably similar.

Notes

  1. Here, the standard coding procedure in the Thurstonian choice literature (Maydeu-Olivares & Böckenholt, 2005) is adopted. It is important to note that at the point of coding, no assumptions are made about the underlying distributions, decision mechanisms, etc.

  2. Negative weights in IP models do not make sense conceptually; hence, we use the squared values.

  3. The equality sign in (9) is arbitrary because the utilities are continuous variables and two utilities can never take on exactly the same value (Maydeu-Olivares & Böckenholt, 2005).

  4. The double exponential (or Gumbel; sometimes referred to as Weibull) distribution has the cumulative distribution function \(F(z)=\exp (-\exp (-z))\).

  5. Andrich did not give his decision model a name—the name ‘forced endorsement model’ is suggested by the author of this article. This universal decision model should not be mistaken for specific IRT models for unfolding preference data that Andrich (1989, 1995) developed.

  6. Ignoring these assumptions and using the normal ogive link function results in probabilities that are different from those predicted by Thurstone’s model (12). Discrepancies depend on the combination of two utilities, and can be large. For normally distributed utilities, Thurstone’s model provides better prediction.

  7. Unlike in paired comparison tasks, it is assumed that no items are repeated across the forced-choice questionnaire. This is common practice in questionnaire design.

  8. For the partial ranking design whereby only one “best” item must be chosen, the multinomial logistic model of McFadden (16) may be used to model choices within each block, if it can be assumed that the error variances are all equal. The choices for different blocks are independent conditional on the personal attributes, and the probability of the observed response pattern is the product of the probabilities of the block choices. Since the assumption of equal error variances is often untenable, this model will not be considered further.
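To make Notes 4 and 8 concrete, the following is a minimal Python sketch, not the article's own implementation. It assumes that the multinomial logistic model referred to as (16) takes the standard conditional logit form, which arises when item utilities carry independent Gumbel errors of equal variance. The function names and the example utility values are hypothetical and chosen only for illustration; the utilities stand for mean item utilities already conditioned on a respondent's attributes.

```python
import math


def gumbel_cdf(z):
    """Cumulative distribution function F(z) = exp(-exp(-z)) from Note 4."""
    return math.exp(-math.exp(-z))


def best_choice_probabilities(utilities):
    """Probability that each item in one block is picked as 'best'.

    Assumes the standard conditional logit form exp(u_i) / sum_k exp(u_k),
    i.e. equal error variances across items (an assumption, see Note 8).
    """
    shift = max(utilities)  # subtract the maximum for numerical stability
    weights = [math.exp(u - shift) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]


def pattern_probability(blocks, choices):
    """Probability of a whole response pattern.

    Block choices are treated as independent given the attributes, so the
    pattern probability is the product of the per-block choice probabilities.
    `blocks` is a list of utility lists; `choices` holds the index of the
    item chosen as 'best' in each block.
    """
    probability = 1.0
    for utilities, chosen in zip(blocks, choices):
        probability *= best_choice_probabilities(utilities)[chosen]
    return probability


# Hypothetical example: one pair and one block of four, all utilities equal,
# so the choices are equiprobable within each block: 1/2 * 1/4.
print(pattern_probability([[0.0, 0.0], [0.0, 0.0, 0.0, 0.0]], [0, 1]))  # 0.125
```

Because the sketch fixes all error variances to be equal, it inherits the limitation stated in Note 8 and is meant only to show how the block-level probabilities combine.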

References

  • Andersen, E. B. (1976). Paired comparisons with individual differences. Psychometrika, 41(2), 141–157.

  • Andrich, D. (1989). A probabilistic IRT model for unfolding preference data. Applied Psychological Measurement, 13, 193–296.

  • Andrich, D. (1995). Hyperbolic cosine latent trait models for unfolding direct-responses and pairwise preferences. Applied Psychological Measurement, 20, 269–290.

  • Bartram, D. (2007). Increasing validity with forced-choice criterion measurement formats. International Journal of Selection and Assessment, 15, 263–272.

  • Bennett, J. F., & Hays, W. L. (1960). Multidimensional unfolding: Determining the dimensionality of ranked preference data. Psychometrika, 25, 27–43.

  • Block, J. (1961). The Q-sort method in personality assessment and psychiatric research. Springfield, IL: Charles C. Thomas.

  • Böckenholt, U. (2004). Comparative judgments as an alternative to ratings: Identifying the scale origin. Psychological Methods, 9, 453–465.

  • Böckenholt, U. (2006). Thurstonian-based analyses: Past, present and future utilities. Psychometrika, 71(4), 615–629.

  • Bradley, R. A. (1953). Some statistical methods in taste testing and quality evaluation. Biometrics, 9, 22–38.

  • Bradley, R. A., & Terry, M. E. (1952). Rank analysis of incomplete block designs: I. The method of paired comparisons. Biometrika, 39, 324–345.

  • Brady, H. E. (1989). Factor and ideal point analysis for interpersonally incomparable data. Psychometrika, 54, 181–202.

  • Brown, A. (2009). Doing less but getting more: Improving forced-choice measures with IRT. Paper presented at the 24th annual conference of the Society for Industrial and Organizational Psychology, New Orleans, LA.

  • Brown, A. & Bartram, D. (2009–2011). OPQ32r Technical Manual. Surrey, UK: SHL Group.

  • Brown, A., & Maydeu-Olivares, A. (2010). Issues that should not be overlooked in the dominance versus ideal point controversy. Industrial and Organizational Psychology, 3, 489–493.

  • Brown, A., & Maydeu-Olivares, A. (2011). Item response modeling of forced-choice questionnaires. Educational and Psychological Measurement, 71, 460–502.

  • Brown, A., & Maydeu-Olivares, A. (2012). Fitting a Thurstonian IRT model to forced-choice data using Mplus. Behavior Research Methods, 44, 1135–1147.

  • Brown, A., & Maydeu-Olivares, A. (2013). How IRT can solve problems of ipsative data in forced-choice questionnaires. Psychological Methods, 18, 36–52.

  • Brown, A., & Maydeu-Olivares, A. (in press). Modeling forced-choice response formats. In P. Irwing, T. Booth, & D. Hughes (Eds.), The Wiley Handbook of Psychometric Testing. London: Wiley.

  • Chan, W. (2003). Analyzing ipsative data in psychological research. Behaviormetrika, 30, 99–121.

  • Cheung, M. W. L., & Chan, W. (2002). Reducing uniform response bias with ipsative measurement in multiple-group confirmatory factor analysis. Structural Equation Modeling, 9, 55–77.

  • Christiansen, N., Burns, G., & Montgomery, G. (2005). Reconsidering the use of forced-choice formats for applicant personality assessment. Human Performance, 18, 267–307.

  • Clemans, W. V. (1966). An analytical and empirical examination of some properties of ipsative measures. Psychometric Monographs, 14.

  • Coombs, C. H. (1950). Psychological scaling without a unit of measurement. Psychological Review, 57, 145–158.

  • Coombs, C. H. (1960). A theory of data. Psychological Review, 67, 143–159.

  • De Soete, G., & Carroll, J. D. (1983). A maximum likelihood method for fitting the wandering vector model. Psychometrika, 48, 553–566.

  • Drasgow, F., Chernyshenko, O. S., & Stark, S. (2009). Test theory and personality measurement. In J. N. Butcher (Ed.), Oxford handbook of personality assessment. London: Oxford University Press.

  • Drasgow, F., Chernyshenko, O. S., & Stark, S. (2010). 75 years after Likert: Thurstone was right! Industrial and Organizational Psychology: Perspectives on Science and Practice, 3, 465–476.

  • Huang, J., & Mead, A. D. (2014, July 7). Effect of personality item writing on psychometric properties of ideal-point and Likert scales. Psychological Assessment. Advance online publication. doi: http://dx.doi.org/10.1037/a0037273.

  • Jackson, D., Wroblewski, V., & Ashton, M. (2000). The impact of faking on employment tests: Does forced choice offer a solution? Human Performance, 13, 371–388.

  • Luce, R. D. (1959). Individual choice behavior: A theoretical analysis. New York, NY: Wiley.

  • Luce, R. D. (1977). The choice axiom after twenty years. Journal of Mathematical Psychology, 15, 215–233.

  • Martin, B. A., Bowen, C.-C., & Hunt, S. T. (2002). How effective are people at faking on personality questionnaires? Personality and Individual Differences, 32, 247–256.

  • Maydeu-Olivares, A. (1999). Thurstonian modeling of ranking data via mean and covariance structure analysis. Psychometrika, 64, 325–340.

  • Maydeu-Olivares, A., & Böckenholt, U. (2005). Structural equation modeling of paired-comparison and ranking data. Psychological Methods, 10, 285–304.

  • Maydeu-Olivares, A., & Böckenholt, U. (2008). Modeling subjective health outcomes: Top 10 reasons to use Thurstone’s method. Medical Care, 46, 346–348.

  • Maydeu-Olivares, A., & Brown, A. (2010). Item response modeling of paired comparison and ranking data. Multivariate Behavioral Research, 45, 935–974.

  • McCloy, R., Heggestad, E., & Reeve, C. (2005). A silk purse from the sow’s ear: Retrieving normative information from multidimensional forced-choice items. Organizational Research Methods, 8, 222–248.

  • McDonald, R. P. (1999). Test theory: A unified treatment. Mahwah, NJ: Erlbaum.

  • McFadden, D. (1973). Conditional logit analysis of qualitative choice behavior. In P. Zarembka (Ed.), Frontiers in Econometrics. New York: Academic Press.

  • McFadden, D. (1976). Quantal choice analysis: A survey. Annals of Economic and Social Measurement, 5, 363–390.

  • McFadden, D. (2001). Economic choices. The American Economic Review, 91(3), 351–378.

  • Meade, A. (2004). Psychometric problems and issues involved with creating and using ipsative measures for selection. Journal of Occupational and Organisational Psychology, 77, 531–552.

  • Muthén, L.K. & Muthén, B.O. (1998–2012). Mplus user’s guide (7th ed.). Los Angeles, CA: Muthén & Muthén.

  • Roberts, J. S., Donoghue, J. R., & Laughlin, J. E. (2000). A general item response theory model for unfolding unidimensional polytomous responses. Applied Psychological Measurement, 24, 3–32.

  • Schwarz, N., Knäuper, B., Hippler, H. J., Noelle-Neumann, E., & Clark, L. (1991). Rating scales: Numeric values may change the meaning of scale labels. Public Opinion Quarterly, 55, 570–582.

  • Shepard, R. N. (1957). Stimulus and response generalization: A stochastic model relating generalization to distance in psychological space. Psychometrika, 22, 325–345.

  • Stark, S., Chernyshenko, O., & Drasgow, F. (2005). An IRT approach to constructing and scoring pairwise preference items involving stimuli on different dimensions: The multi-unidimensional pairwise-preference model. Applied Psychological Measurement, 29, 184–203.

  • Stark, S., & Drasgow, F. (2002). An EM approach to parameter estimation for the Zinnes and Griggs paired comparison IRT model. Applied Psychological Measurement, 26, 208–227.

  • Takane, Y. (1987). Analysis of covariance structures and probabilistic binary choice data. Communication and Cognition, 20, 45–62.

  • Takane, Y. (1996). An item response model for multidimensional analysis of multiple choice data. Behaviormetrika, 23, 153–167.

  • Takane, Y., & De Leeuw, J. (1987). On the relationship between item response theory and factor analysis of discretized variables. Psychometrika, 52, 393–408.

  • Thurstone, L. L. (1927). A law of comparative judgment. Psychological Review, 34, 273–286.

  • Thurstone, L. L. (1928). Attitudes can be measured. American Journal of Sociology, 33, 529–554.

  • Thurstone, L. L. (1929). The measurement of psychological value. In T. V. Smith & W. K. Wright (Eds.), Essays in philosophy by seventeen doctors of philosophy of the University of Chicago (pp. 157–174). Chicago: Open Court.

  • Thurstone, L. L. (1931). Rank order as a psychophysical method. Journal of Experimental Psychology, 14, 187–201.

  • Tsai, R. C., & Böckenholt, U. (2001). Maximum likelihood estimation of factor and ideal point models for paired comparison data. Journal of Mathematical Psychology, 45, 795–811.

  • Tversky, A. (1972). Elimination by aspects: A theory of choice. Psychological Review, 79(4), 281–299.

  • Vasilopoulos, N. L., Cucina, J. M., Dyomina, N. V., Morewitz, C. L., & Reilly, R. R. (2006). Forced-choice personality tests: A measure of personality and cognitive ability? Human Performance, 19, 175–199.

  • Zinnes, J. L., & Griggs, R. A. (1974). Probabilistic, multidimensional unfolding analysis. Psychometrika, 39, 327–350.

Acknowledgments

I am grateful to Alberto Maydeu-Olivares for his continuous support and helpful comments on an earlier draft of this paper.

Author information

Correspondence to Anna Brown.

Cite this article

Brown, A. Item Response Models for Forced-Choice Questionnaires: A Common Framework. Psychometrika 81, 135–160 (2016). https://doi.org/10.1007/s11336-014-9434-9

  • DOI: https://doi.org/10.1007/s11336-014-9434-9
