Cultural Consensus Theory for the Ordinal Data Case

Psychometrika

Abstract

A Cultural Consensus Theory approach for ordinal data is developed, leading to a new model for ordered polytomous data. The model introduces a novel way of measuring response biases and also measures consensus item values, a consensus response scale, item difficulty, and informant knowledge. The model is extended as a finite mixture model to fit both simulated and real multicultural data, in which subgroups of informants have different sets of consensus item values. The extension is thus a form of model-based clustering for ordinal data. The hierarchical Bayesian framework is utilized for inference, and two posterior predictive checks are developed to verify the central assumptions of the model.


Notes

  1. Note that these latter two parameters generally have different interpretations than in current typical IRT models, as is further explained in the discussion.

  2. One way of coding this (for Bayesian inference software) can be viewed in Appendix A.

  3. When C=2 categories, the sum-to-zero constraint provides that the single threshold satisfies γ 1 = 0.

  4. These settings were found to work well in applications to both simulated and real data; however, researchers may consider exploring other prior distribution settings to optimize mixing within their specific applications.

  5. As a finite mixture model, the MC-LTRM is subject to label-switching and mixing phenomena (Stephens 2000), which need to be addressed before calculating convergence diagnostics, model comparison statistics, and posterior predictive checks. For more information on handling these, see Section 3 of Anders and Batchelder (2012).

  6. All simulated data sets were generated from the hierarchical LTRM or MC-LTRM specified in Section 2.3, with hyperparameters randomly generated in a sensible interval within the diffuse hyperpriors using the uniform distribution.

  7. With respect to the discrete parameters (the e i of the MC-LTRM), inspecting their trace plots to assess whether the chains have converged on similar distributions may be preferable to the \(\hat{R}\) diagnostic, since \(\hat{R}\) may have difficulty properly assessing convergence for discrete parameters whose chains are likely to converge to distributions with zero or near-zero variance (as driven by a strong signal in the data).

  8. Simulations similar to the one reported here were performed with the MC-LTRM\(_{\lambda_{k} \not= 1}\). The parameter recovery and eigenvalue check results were comparable. In addition, the DIC often preferred the MC-LTRM\(_{\lambda_{k} \not= 1}\) over the MC-LTRM\(_{\lambda_{k} = 1}\) for data simulated by the MC-LTRM\(_{\lambda_{k} \not= 1}\), unless there was very little heterogeneity in the simulated λ k values.

  9. In IRT models, the item difficulty is established by the response thresholds for each item, whereas in contrast, the response thresholds of the LTRM pertain to characteristic response biases of each informant.

  10. Note: another way of measuring consensus student abilities is to have N raters assess P productions from each student and link these hierarchically to M student ability traits (where M is the number of students involved); this has the benefit of using more than one assessment per grader on each student to calculate the consensus values of latent student ability and is the design of the HRM discussed previously. One could consider expanding the LTRM this way in future work.
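The caveat in Note 7 about \(\hat{R}\) for discrete parameters can be seen concretely. The sketch below is a minimal, illustrative Gelman–Rubin computation (not the exact diagnostic implementation used in the analysis; the function name is ours): when every chain of a discrete membership parameter collapses onto the same single value, the within-chain variance is zero and \(\hat{R}\) is undefined.

```python
def gelman_rubin(chains):
    """Minimal Gelman-Rubin R-hat for a list of equal-length chains.
    Returns float('nan') when the within-chain variance is zero."""
    m = len(chains)
    n = len(chains[0])
    means = [sum(c) / n for c in chains]
    grand = sum(means) / m
    B = n / (m - 1) * sum((mu - grand) ** 2 for mu in means)   # between-chain
    W = sum(sum((x - mu) ** 2 for x in c) / (n - 1)            # within-chain
            for c, mu in zip(chains, means)) / m
    if W == 0.0:                       # degenerate chains (e.g. discrete e_i)
        return float('nan')
    var_hat = (n - 1) / n * W + B / n
    return (var_hat / W) ** 0.5

# Two discrete membership chains that both collapse onto cluster 2:
stuck = [[2] * 6, [2] * 6]
print(gelman_rubin(stuck))             # nan: W = 0, R-hat undefined
# Chains that do vary give an ordinary finite value:
print(round(gelman_rubin([[1, 2, 1, 2, 1, 2], [2, 1, 2, 1, 2, 1]]), 3))
```

In the degenerate case, trace-plot inspection still shows immediately that both chains sit on the same value, which is why Note 7 recommends it.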

References

  • Anders, R. (2013). CCTpack: cultural consensus theory applications to data. R package version 0.9.

  • Anders, R., & Batchelder, W.H. (2012). Cultural consensus theory for multiple consensus truths. Journal of Mathematical Psychology, 56, 452–469.

  • Batchelder, W.H., & Anders, R. (2012). Cultural consensus theory: comparing different concepts of cultural truth. Journal of Mathematical Psychology, 56, 316–332.

  • Batchelder, W.H., & Romney, A.K. (1986). The statistical analysis of a general Condorcet model for dichotomous choice situations. In B. Grofman & G. Owen (Eds.), Information pooling and group decision making: proceedings of the second University of California Irvine conference on political economy (pp. 103–112). Greenwich: JAI Press.

  • Batchelder, W.H., & Romney, A.K. (1988). Test theory without an answer key. Psychometrika, 53, 71–92.

  • Batchelder, W.H., & Romney, A.K. (1989). New results in test theory without an answer key. In Roskam (Ed.), Mathematical psychology in progress (pp. 229–248). Heidelberg: Springer.

  • Buhrmester, M., Kwang, T., & Gosling, S.D. (2011). Amazon’s Mechanical Turk: a new source of inexpensive, yet high-quality, data? Perspectives on Psychological Science, 6, 3–5.

  • Comrey, A.L. (1962). The minimum residual method of factor analysis. Psychological Reports, 11, 15–18.

  • De Boeck, P. (2008). Random item IRT models. Psychometrika, 73, 533–559.

  • DeCarlo, L.T. (2005). A model of rater behavior in essay grading based on signal detection theory. Journal of Educational Measurement, 42, 53–76.

  • Fischer, G.H., & Molenaar, I.W. (1995). Rasch models: recent developments and applications. New York: Springer.

  • Fox, C.R., & Tversky, A. (1995). Ambiguity aversion and comparative ignorance. The Quarterly Journal of Economics, 110, 585–603.

  • Fox, J. (2013). polycor: polychoric and polyserial correlations. R package version 0.7-8.

  • Gelman, A., Carlin, J.B., Stern, H.S., & Rubin, D.B. (2004). Bayesian data analysis (2nd ed.). Boca Raton: Chapman and Hall/CRC.

  • Gonzalez, R., & Wu, G. (1999). On the shape of the probability weighting function. Cognitive Psychology, 38, 129–166.

  • Green, D.M., & Swets, J.A. (1966). Signal detection theory and psychophysics. New York: Wiley.

  • Hruschka, D.J., Kalim, N., Edmonds, J., & Sibley, L. (2008). When there is more than one answer key: cultural theories of postpartum hemorrhage in Matlab, Bangladesh. Field Methods, 20, 315–337.

  • Johnson, V.E., & Albert, J.H. (1999). Ordinal data modeling. Statistics for social science and public policy. Berlin: Springer.

  • Karabatsos, G., & Batchelder, W.H. (2003). Markov chain estimation methods for test theory without an answer key. Psychometrika, 68, 373–389.

  • Kruschke, J.K. (2011). Doing Bayesian data analysis: a tutorial with R and BUGS. Amsterdam: Elsevier/Academic Press.

  • Lancaster, H., & Hamdan, M. (1964). Estimation of the correlation coefficient in contingency tables with possibly nonmetrical characters. Psychometrika, 29, 383–391.

  • Lee, M.D. (2011). How cognitive modeling can benefit from hierarchical Bayesian models. Journal of Mathematical Psychology, 55, 1–7.

  • Lord, F.M., Novick, M.R., & Birnbaum, A. (1968). Statistical theories of mental test scores (Vol. 47). Reading: Addison-Wesley.

  • Macmillan, N.A., & Creelman, C.D. (2005). Detection theory: a user’s guide (2nd ed.). Mahwah: Erlbaum.

  • Nering, M.L., & Ostini, R. (2011). Handbook of polytomous item response theory models. New York: Taylor and Francis.

  • Patz, R.J., Junker, B.W., Johnson, M.S., & Mariano, L.T. (2002). The hierarchical rater model for rated test items and its application to large-scale educational assessment data. Journal of Educational and Behavioral Statistics, 27, 341–384.

  • Plummer, M. (2003). JAGS: a program for analysis of Bayesian graphical models using Gibbs sampling.

  • Plummer, M. (2012). rjags: Bayesian graphical models using MCMC. R package version 3.2.0. http://CRAN.R-project.org/package=rjags.

  • Rasch, G. (1960). Probabilistic models for some intelligence and attainment tests. Copenhagen: Danmarks Paedagogiske Institut.

  • Revelle, W. (2012). psych: procedures for psychological, psychometric, and personality research. Evanston: Northwestern University. R package version 1.2.1.

  • Rigdon, E.E. (2010). Polychoric correlation coefficient. In N.J. Salkind (Ed.), Encyclopedia of research design (pp. 1046–1049). Thousand Oaks: Sage.

  • Romney, A.K., & Batchelder, W.H. (1999). Cultural consensus theory. In R. Wilson & F. Keil (Eds.), The MIT encyclopedia of the cognitive sciences (pp. 208–209). Cambridge: MIT Press.

  • Romney, A.K., Batchelder, W.H., & Weller, S.C. (1987). Recent applications of cultural consensus theory. American Behavioral Scientist, 31, 163–177.

  • Romney, A.K., Weller, S.C., & Batchelder, W.H. (1986). Culture as consensus: a theory of culture and informant accuracy. American Anthropologist, 88, 313–338.

  • Samejima, F. (1969). Estimation of latent ability using a response pattern of graded scores. Psychometrika Monograph Supplement.

  • Spearman, C.E. (1904). ‘General intelligence’ objectively determined and measured. The American Journal of Psychology, 15, 72–101.

  • Spiegelhalter, D.J., Best, N.G., Carlin, B.P., & van der Linde, A. (2002). Bayesian measures of model complexity and fit (with discussion). Journal of the Royal Statistical Society, Series B, 64, 583–639.

  • Sprouse, J., Wagers, M., & Phillips, C. (2012). A test of the relation between working-memory capacity and syntactic island effects. Language, 88, 82–123.

  • Stephens, M. (2000). Dealing with label switching in mixture models. Journal of the Royal Statistical Society, Series B, 62, 795–809.

  • Takane, Y., & de Leeuw, J. (1987). On the relationship between item response theory and factor analysis of discretized variables. Psychometrika, 52, 393–408.

  • van der Linden, W.J., & Hambleton, R.K. (1997). Handbook of modern item response theory. Berlin: Springer.

  • Weller, S.C. (2007). Cultural consensus theory: applications and frequently asked questions. Field Methods, 19, 339–368.

  • Zhang, H., & Maloney, L.T. (2012). Ubiquitous log odds: a common representation of probability and frequency distortion in perception, action and cognition. Frontiers in Neuroscience, 6.

Acknowledgements

This research was funded by grants to the second author from the U.S. Air Force Office of Scientific Research (AFOSR), the Army Research Office (ARO), and an award from the Oak Ridge Institute for Science and Education (ORISE).

We would like to thank Jon Sprouse for making his grammaticality data set available to us. In addition, we are grateful to Zita Oravecz for her helpful advice and comments.

Author information

Corresponding author

Correspondence to Royce Anders.

Appendices

Appendix A. JAGS Model Code

1.1 A.1 Latent Truth Rater Model (LTRM) Code

model{
 #Data
 for (i in 1:n){
  for (k in 1:m){
   tau[i,k] <- E[i]/lam[k]
   pX[i,k,1] <- pnorm((a[i]*g[1]) + b[i],T[k],tau[i,k])
   for (c in 2:(C-1)){
    pX[i,k,c] <- pnorm((a[i]*g[c]) + b[i],T[k],tau[i,k]) - sum(pX[i,k,1:(c-1)])}
   pX[i,k,C] <- (1 - sum(pX[i,k,1:(C-1)]))
   X[i,k] ~ dcat(pX[i,k,1:C])}}
 #Parameters
 for (k in 1:m){
  T[k] ~ dnorm(Tmu,Ttau)
  ilogitT[k] <- ilogit(T[k])
  lam[k] ~ dgamma(lamtau,lamtau)}
 for (c in 1:(C-2)){
  dg[c] ~ dnorm(0,.1)}
 dg2[1:(C-2)] <- dg[1:(C-2)]
 dg2[C-1] <- -sum(dg[1:(C-2)])
 g <- sort(dg2)
 for (c in 1:(C-1)){
  ilogitg[c] <- ilogit(g[c])}
 for (i in 1:n){
  E[i] ~ dgamma(pow(Emu,2)*Etau,Emu*Etau)
  a[i] ~ dgamma(atau,atau)
  b[i] ~ dnorm(bmu,btau)}
 #Hyperparameters
 Tmu ~ dnorm(0,.1)
 Ttau ~ dgamma(1,.1)
 bmu <- 0
 btau ~ dgamma(1,.1)
 amu <- 1
 atau ~ dgamma(1,.1)
 Emu ~ dgamma(4,4)
 Etau ~ dgamma(4,4)
 lammu <- 1
 lamtau ~ dgamma(4,4)}
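As an informal cross-check of the listing above, the LTRM response process can be sketched outside JAGS. The Python translation below is illustrative only (the function and argument names are ours): a latent appraisal Y ~ Normal(T k , precision E i /λ k ) is categorized against the informant's biased thresholds δ c = a i γ c + b i , mirroring the pnorm/dcat likelihood.

```python
import random, math

def ltrm_response(T_k, E_i, lam_k, a_i, b_i, gammas, rng=random):
    """One draw from the LTRM response process (a sketch matching the
    JAGS likelihood above): the informant's latent appraisal of item k
    is Y ~ Normal(T_k, precision E_i/lam_k), categorized with the
    informant's biased thresholds delta_c = a_i * gamma_c + b_i."""
    sd = math.sqrt(lam_k / E_i)                 # precision tau_ik = E_i/lam_k
    Y = rng.gauss(T_k, sd)
    deltas = [a_i * g + b_i for g in gammas]    # biased response thresholds
    category = 1
    for d in deltas:
        if Y > d:
            category += 1
    return category                             # in 1..C, C = len(gammas)+1

rng = random.Random(1)
# Consensus value well above every threshold plus high knowledge E_i,
# so the top of C = 4 categories is essentially certain:
print(ltrm_response(T_k=3.0, E_i=50.0, lam_k=1.0,
                    a_i=1.0, b_i=0.0, gammas=[-1.0, 0.0, 1.0], rng=rng))
```

Larger a i stretches the thresholds apart (pushing responses toward the middle categories), while b i shifts them all, which is how the model separates response bias from knowledge.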

1.2 A.2 Multi-Culture Latent Truth Rater Model (MC-LTRM) Code

model{
 #Data
 for (i in 1:n){
  for (k in 1:m){
   tau[i,k] <- E[i]/lam[k]
   pX[i,k,1] <- pnorm((a[i]*g[1,om[i]]) + b[i],T[k,om[i]],tau[i,k])
   for (c in 2:(C-1)){
    pX[i,k,c] <- pnorm((a[i]*g[c,om[i]]) + b[i],T[k,om[i]],tau[i,k]) - sum(pX[i,k,1:(c-1)])}
   pX[i,k,C] <- (1 - sum(pX[i,k,1:(C-1)]))
   X[i,k] ~ dcat(pX[i,k,1:C])}}
 #Parameters
 for (v in 1:V){
  for (k in 1:m){
   T[k,v] ~ dnorm(Tmu[v],Ttau[v])
   ilogitT[k,v] <- ilogit(T[k,v])}
  for (c in 1:(C-2)){
   dg[c,v] ~ dnorm(0,.1)}
  dg2[1:(C-2),v] <- dg[1:(C-2),v]
  dg2[C-1,v] <- -sum(dg[1:(C-2),v])
  g[1:(C-1),v] <- sort(dg2[1:(C-1),v])
  for (c in 1:(C-1)){
   ilogitg[c,v] <- ilogit(g[c,v])}}
 for (k in 1:m){
  lam[k] ~ dgamma(1*lamtau,1*lamtau)}
 for (i in 1:n){
  om[i] ~ dcat(pi)
  E[i] ~ dgamma(pow(Emu[om[i]],2)*Etau[om[i]],Emu[om[i]]*Etau[om[i]])
  a[i] ~ dgamma(atau[om[i]],atau[om[i]])
  b[i] ~ dnorm(bmu[om[i]],btau[om[i]])}
 pi[1:V] ~ ddirch(alpha)
 #Hyperparameters
 for (v in 1:V){
  alpha[v] <- 1
  Tmu[v] ~ dnorm(0,.1)
  Ttau[v] ~ dgamma(1,.1)
  bmu[v] <- 0
  btau[v] ~ dgamma(1,.1)
  amu[v] <- 1
  atau[v] ~ dgamma(1,.1)
  Emu[v] ~ dgamma(4,4)
  Etau[v] ~ dgamma(4,4)}
 lammu <- 1
 lamtau ~ dgamma(1,.1)}
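As mentioned in Note 5, label switching must be resolved before summarizing MC-LTRM output. The sketch below is a simplified stand-in for the Stephens (2000) approach (which minimizes a KL-based loss over the full posterior): here each posterior sample's cluster labels are simply permuted to best match a set of reference cluster means. All names are illustrative.

```python
from itertools import permutations
import math

def relabel_sample(sample_means, reference_means):
    """Return the permutation of cluster labels that brings the sampled
    cluster means closest (in squared distance) to the reference means."""
    V = len(reference_means)
    best_perm, best_loss = None, math.inf
    for perm in permutations(range(V)):
        # perm[v] = which sampled cluster plays the role of reference cluster v
        loss = sum((sample_means[perm[v]] - reference_means[v]) ** 2
                   for v in range(V))
        if loss < best_loss:
            best_perm, best_loss = perm, loss
    return best_perm

# A sample whose two cluster labels are switched relative to the reference:
reference = [-1.0, 2.0]
sample = [2.1, -0.9]              # cluster means from one MCMC sample
print(relabel_sample(sample, reference))   # (1, 0): swap the labels back
```

After relabeling every sample this way, cluster-specific summaries (and the e i memberships) refer to stable cluster identities across the chain.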

Appendix B. Cities Questionnaire

2.1 B.1 Informants rated the following for either Irvine, New York, or Miami

  1. Rate the amount of rain experienced during the fall

  2. Rate the amount of snow experienced during the winter

  3. Rate the level of humidity in the summer

  4. Rate the general wind factor during the fall

  5. Rate how cold it is during the winter

  6. Rate how hot it is during the summer

  7. Rate the range of temperatures experienced across the year

  8. Rate the number of people who use public transportation as their primary mode of transport

  9. Rate the amount of crime that occurs

  10. Rate the amount of ethnic/racial diversity

  11. Rate how liberally minded the general population is

  12. Rate how much “nightlife” the city has

  13. Rate the population density of the city

  14. Rate how close the ocean is

  15. Rate how modernized the city is

  16. Rate the air quality (smog level) of the city

  17. Rate the cleanliness of the city

  18. Rate how well-known the city is compared to other cities in the state

  19. Rate the cost of living in the city

  20. Rate the number of homeless people living in the city

Appendix C. Spearman and Item Difficulty Properties of the Model

3.1 C.1 Spearman Law Property with Proof

Theorem 1

Suppose that Axioms 1*, 2–5, and 6* hold for the MC-LTRM. Then given fixed values of \(\pmb{\mathcal{T}}\), E, Λ, and Ω: ∀i,j=1,…,N, i≠j,

$$ \begin{aligned} \rho(Y_{iK},Y_{jK}) = \rho(Y_{iK},T_{\Omega_i K})\rho(Y_{jK},T_{\Omega_j K}) \rho (T_{\Omega_i K},T_{\Omega_j K}) , \end{aligned} $$
(C.1)

where K is a random variable representing item indices, with probability mass function \(\operatorname{Pr}(K=k) = 1/M \ \forall k = 1, \ldots, M \).

Proof

Note that \(Y_{ik} = T_{\Omega_{i} k} + \epsilon_{ik}\) for the MC-LTRM and that ∀i,k,E(ϵ ik )=0. Further, by Axiom 4, conditional independence requires that all of the \((\epsilon _{ik})_{i=1}^{N}\) are conditionally independent for fixed \(\pmb{\mathcal {T}}\), E, Λ, and Ω. From these assumptions the terms in (C.1) can be calculated as follows. First, note that with the matrix of latent appraisals, Y=(Y ik ) N×M , the correlation between two informants over items is

$$ \rho(Y_{iK},Y_{jK}) = \frac{\mbox{Cov}(Y_{iK},Y_{jK})}{\sqrt{\mbox {Var}(Y_{iK})\mbox{Var}(Y_{jK})}} = \frac {E(Y_{iK}Y_{jK})-E(Y_{iK})E(Y_{jK})}{\sqrt{\mbox{Var}(Y_{iK})\mbox {Var}(Y_{jK})}} . $$
(C.2)

Next, note that

$$ E(Y_{iK}Y_{jK}) = E_K \bigl[E(Y_{iK}Y_{jK} \mid K)\bigr] = \frac{1}{M} \sum_{k=1}^M E\bigl[(T_{ik} + \epsilon_{ik}) (T_{jk}+ \epsilon_{jk})\bigr] . $$
(C.3)

Now from the zero mean and conditional independence properties of the error random variables, it is clear that (C.3) becomes

$$ E(Y_{iK}Y_{jK}) = \frac{1}{M} \sum _{k=1}^M T_{ik}T_{jk} . $$

Similarly, other aspects of (C.2) can be computed, such as

$$ E(Y_{iK}) = \frac{1}{M} \sum_{k=1}^M E(T_{ik}+\epsilon_{ik}) = E(T_{iK}) , $$

so the result is

$$ \rho(Y_{iK},Y_{jK}) = \frac{\mbox{Cov}(T_{iK},T_{jK})}{\sqrt{\mbox {Var}(Y_{iK})\mbox{Var}(Y_{jK})}} . $$
(C.4)

Next consider the terms on the right side of (C.1). Using the same methods, it is easy to calculate

$$ \rho(Y_{iK},T_{iK}) = \frac{\sum_{k=1}^M T_{ik}^2 / M - E^2 (T_{iK})}{\sqrt{\mbox{Var}(Y_{iK})\mbox{Var}(T_{iK})}} = \sqrt{ \frac{\mbox{Var}(T_{iK})}{\mbox{Var}(Y_{iK})}} ,\qquad \rho (Y_{jK},T_{jK}) = \sqrt{ \frac{\mbox{Var}(T_{jK})}{\mbox{Var}(Y_{jK})}} , $$

where the computational formula for the variance of a random variable, Var(X)=E(X 2)−E 2(X), is used. Finally, the third correlation is obtained as

$$ \rho(T_{iK},T_{jK}) = \frac{\mbox{Cov}(T_{iK},T_{jK})}{\sqrt{\mbox{Var}(T_{iK})\mbox{Var}(T_{jK})}} . $$

When these three correlations are multiplied, the result is (13). □

While the triple correlation property behind the latent appraisals of the MC-LTRM is given in (13), note that if all informants share the same cultural truth (V=1), then (13) reduces to

$$ \begin{aligned} \rho(Y_{iK},Y_{jK}) = \rho(Y_{iK},T_{K})\rho (Y_{jK},T_{K}) . \end{aligned} $$
(C.5)
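Property (C.5) is easy to verify numerically. The sketch below (illustrative parameter values; sample correlations only approximate the population identity, hence the tolerance) simulates latent appraisals Y ik = T k + ε ik for two informants with different error variances and compares the two sides of (C.5).

```python
import random, math

def corr(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    vx = sum((a - mx) ** 2 for a in x) / n
    vy = sum((b - my) ** 2 for b in y) / n
    return cov / math.sqrt(vx * vy)

rng = random.Random(0)
M = 200_000                                  # many items so sample corrs settle
T = [rng.gauss(0.0, 1.0) for _ in range(M)]  # shared consensus values (V = 1)
sd_i, sd_j = 0.8, 1.5                        # informants i, j differ in precision
Yi = [t + rng.gauss(0.0, sd_i) for t in T]   # Y_ik = T_k + eps_ik
Yj = [t + rng.gauss(0.0, sd_j) for t in T]

lhs = corr(Yi, Yj)                           # left side of (C.5)
rhs = corr(Yi, T) * corr(Yj, T)              # right side of (C.5)
print(abs(lhs - rhs) < 0.01)                 # True: the two sides agree
```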

3.2 C.2 Item Difficulty Property with Proof

Theorem 2

Suppose that Axioms 1–6 hold for the LTRM and item difficulty is neutral (λ k =1). Then, for fixed T and E,

$$ \forall k=1, \ldots, M, \quad V_k ^* = \sum _{i=1}^N \frac {\lambda_k}{E_i} = \sum _{i=1}^N \frac{1}{E_i} . $$

Proof

First, using conditional independence, for any item k, we have

$$ V_k^* = \mbox{Var} \Biggl(\sum_{i=1}^N Y_{ik}\Biggr) = \sum_{i=1}^N \mbox{Var}(Y_{ik}) = \sum_{i=1}^N \mbox{Var}(T_k + \epsilon_{ik}) = \sum _{i=1}^N \mbox{Var}(\epsilon_{ik}). $$

From (2), Var(ϵ ik )=1/τ ik , and under the assumption that λ k =1 in (3),

$$ V_k^* = \sum_{i=1}^N \frac{1}{E_i} , $$

and this is the same for all items k. □
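Theorem 2 can likewise be checked by simulation. The sketch below (illustrative values) draws many replications of the informant sum Σ i Y ik with λ k = 1 and compares the empirical variance of the sum to Σ i 1/E i .

```python
import random

rng = random.Random(42)
E = [0.5, 1.0, 2.0, 4.0]           # informant knowledge (precisions), lam_k = 1
T_k = 0.3                          # an arbitrary consensus value for item k
R = 200_000                        # Monte Carlo replications of item k

sums = []
for _ in range(R):
    # Y_ik = T_k + eps_ik with Var(eps_ik) = 1/E_i (Theorem 2, lam_k = 1)
    sums.append(sum(T_k + rng.gauss(0.0, (1.0 / e) ** 0.5) for e in E))

mean = sum(sums) / R
var_emp = sum((s - mean) ** 2 for s in sums) / R
var_theory = sum(1.0 / e for e in E)          # 2 + 1 + 0.5 + 0.25 = 3.75
print(abs(var_emp - var_theory) < 0.05)       # True
```

Since var_theory does not depend on k, every item has the same summed-response variance under neutral difficulty, which is exactly what the posterior predictive check built on V k * exploits.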

3.3 C.3 Rasch Application for Including Item Difficulty

If the item difficulty is assumed heterogeneous (varies over items), then the parameter λ k >0 scales the magnitude of the informant precision, depending on the difficulty of the item; for example, assigning student grades is a type of categorization, and one might conceive of essays that are easier to grade than others. The logic behind incorporating the item difficulty in the model as in (3) is achieved by following earlier CCT work that adopts a form of the Rasch (1960) model (Batchelder & Romney, 1988; Karabatsos & Batchelder, 2003).

The Rasch model applies to a doubly indexed parameter, p ik with space (0,1), and since τ ik ∈(0,∞), one can see how the relationship in (3) develops by applying the Rasch model to

$$ \tau_{ik}^{*} = \frac{\tau_{ik}}{(1+\tau_{ik})} \in(0,1) . $$

Then the Rasch model can be parameterized in several different ways (e.g., Fischer and Molenaar 1995), and in this case, a convenient way is to use the parameterization

$$ \tau_{ik}^{*} = \frac{E_i}{E_i+\lambda_{k}} , $$
(C.6)

where E i ,λ k >0. Note that solving (C.6) for τ ik yields (3).

The key to successfully reducing the number of estimated parameters for doubly indexed, subject-item quantities, such as τ ik , is to assume that the separate contributions, which form (3), are additive on some scale with no interaction. In this case, additivity on a logarithmic scale is demonstrated by lnτ ik =lnE i −lnλ k . However, an issue inherited from the Rasch model is an identifiability problem that must be handled during estimation. One can see the identifiability problem by defining \(\forall c > 0, E_{i}^{*} = c E_{i}, \lambda_{k}^{*} = c \lambda_{k}\); then \(\frac{E_{i}}{\lambda_{k}} = \frac{E_{i}^{*}}{\lambda_{k}^{*}}\). When estimating a fixed-effects version of the model, the identifiability issue can be handled in several ways, such as by setting the mean of the item difficulty parameters to a neutral value, \(\bar{\lambda} = 1\). In the hierarchical version of the model, the hierarchical distribution mean can be set to a neutral value, μ λ =1, which was the method used in our analysis.
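The scale indeterminacy and the \(\bar{\lambda} = 1\) resolution can be demonstrated in a few lines (all values illustrative): rescaling every E i and λ k by a common c > 0 leaves each τ ik = E i /λ k unchanged, and dividing both by the mean item difficulty restores \(\bar{\lambda} = 1\) without altering any τ ik .

```python
# The Rasch-style scale indeterminacy in (3): tau_ik = E_i / lam_k is
# unchanged when both E_i and lam_k are rescaled by the same c > 0.
E = [0.7, 1.3, 2.0]                # informant knowledge (illustrative values)
lam = [0.5, 1.0, 1.5, 2.0]         # item difficulties (illustrative values)
c = 3.7

tau = [[e / l for l in lam] for e in E]
tau_scaled = [[(c * e) / (c * l) for l in lam] for e in E]
same = all(abs(a - b) < 1e-12
           for row, srow in zip(tau, tau_scaled)
           for a, b in zip(row, srow))
print(same)                        # True: the likelihood cannot separate c

# One resolution: rescale so the item difficulties average to 1,
# dividing E by the same factor so every tau_ik is preserved.
mean_lam = sum(lam) / len(lam)     # here 1.25
lam_id = [l / mean_lam for l in lam]
E_id = [e / mean_lam for e in E]
print(abs(sum(lam_id) / len(lam_id) - 1.0) < 1e-12)   # True
```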

About this article

Cite this article

Anders, R., Batchelder, W.H. Cultural Consensus Theory for the Ordinal Data Case. Psychometrika 80, 151–181 (2015). https://doi.org/10.1007/s11336-013-9382-9
