Evaluating and tracking qualitative content coder performance using item response theory

Quality & Quantity

Abstract

Content analysis of traditional and social media plays a central role in investigating features of media content, measuring media exposure, and calculating media effects. The reliability of content coding is usually evaluated using “kappa-centric” agreement measures, but these measures aggregate individual coder decisions and thereby obscure the performance of individual coders. Using a data set of 105 advertisements for sports and energy drinks coded by five coders, we demonstrate that Item Response Theory (IRT) can track coder performance over time and give coder-specific information on the consistency of decisions over qualitatively coded objects. We conclude that IRT should be added to content analysts’ tool kit of useful methodologies to track and evaluate content coders’ performance.

Notes

  1. Because IRT results are factor analysis derived, extensions of the IRT model for measuring additional aspects of “coder agreement” are possible (Dayton 2008; Porcu and Giambona 2017; Uebersax 1992). However, this would entail – at least in Stata – the use of its SEM procedures and a comprehensive knowledge of confirmatory factor analysis.

  2. Three parameter models are used to model guessing in analysis of knowledge and cognitive ability test items but do not apply here because trained coders should never be guessing.

  3. The IRT approach is clearly more efficient when there are many coding decisions. We use only 4 examples here, but the TeenADE data set has over 100 coded ad features, and with 5 coders it would require at least 1000 kappa-centric measures to evaluate all coding decisions, i.e., our Table 2 with 100 columns of ten kappa-centric entries. No one can examine such a table sufficiently carefully.
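
The arithmetic behind Note 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it assumes the kappa-centric measure is a two-coder statistic such as Cohen's kappa computed for every pair of coders on every feature, and the function names (`n_pairwise_measures`, `cohen_kappa`, `pairwise_kappas`) and the toy codes are hypothetical, not taken from the TeenADE data or the authors' analysis.

```python
from itertools import combinations
from math import comb


def n_pairwise_measures(n_coders, n_features):
    """Number of two-coder agreement statistics needed to cover every feature."""
    return comb(n_coders, 2) * n_features


def cohen_kappa(a, b):
    """Cohen's kappa for two coders' binary (0/1) codes on the same items."""
    if len(a) != len(b) or not a:
        raise ValueError("code lists must be non-empty and equal in length")
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    p_a, p_b = sum(a) / n, sum(b) / n
    # Agreement expected by chance from each coder's marginal rate of coding 1.
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    if expected == 1.0:
        # Both coders gave the same constant code; kappa is undefined (0/0).
        return float("nan")
    return (observed - expected) / (1 - expected)


def pairwise_kappas(codes):
    """Kappa for every unordered pair of coders on one coded feature.

    `codes` maps a coder label to that coder's list of 0/1 decisions.
    """
    return {
        (c1, c2): cohen_kappa(codes[c1], codes[c2])
        for c1, c2 in combinations(sorted(codes), 2)
    }


if __name__ == "__main__":
    # 5 coders give 10 coder pairs; 100 features give 10 * 100 statistics.
    print(n_pairwise_measures(5, 100))  # 1000

    # Made-up codes for ONE feature across six ads (illustrative only).
    toy = {
        "coder1": [1, 0, 1, 1, 0, 0],
        "coder2": [1, 0, 1, 0, 0, 0],
        "coder3": [1, 1, 1, 1, 0, 0],
        "coder4": [0, 0, 1, 1, 0, 1],
        "coder5": [1, 0, 1, 1, 0, 0],
    }
    for pair, kappa in pairwise_kappas(toy).items():
        print(pair, round(kappa, 2))
```

Even this single toy feature yields ten pairwise values, which is the table-size problem the note describes once it is repeated over 100 features.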

References

  • Aho, K., Derryberry, D., Peterson, T.: Model selection for ecologists: the worldviews of AIC and BIC. Ecology 95(3), 631–636 (2014)

  • Artstein, R., Poesio, M.: Inter-coder agreement for computational linguistics. Comput. Linguistics 34(4), 555–596 (2008)

  • Banerjee, M., Capozzoli, M., McSweeney, L., Sinha, D.: Beyond kappa: A review of interrater agreement measures. Can. J. Stat. 27(1), 3–23 (1999)

  • Barker, A.B., Whittamore, K., Britton, J., Murray, R.L., Cranwell, J.: A content analysis of alcohol content in UK television. J. Public Health, fdy142–fdy142 (2018). doi:https://doi.org/10.1093/pubmed/fdy142

  • Barnhart, H.X., Haber, M.J., Lin, L.I.: An overview on assessing agreement with continuous measurements. J. Biopharm. Stat. 17(4), 529–569 (2007). doi:https://doi.org/10.1080/10543400701376480

  • Belur, J., Tompson, L., Thornton, A., Simon, M.: Interrater reliability in systematic review methodology: exploring variation in coder decision-making. Sociol. Methods Res., 1–29 (2018). doi:https://doi.org/10.1177/0049124118799372

  • Beullens, K., Schepers, A.: Display of Alcohol Use on Facebook: A Content Analysis. CyberPsychology Behav. Social Netw., 16(7), (2013). doi:https://doi.org/10.1089/cyber.2013.0044

  • Bleakley, A., Fishbein, M., Hennessy, M., Jordan, A., Chernin, A., Stevens, R.: Developing respondent based multi-media measures of exposure to sexual content. Commun. Methods Measures 2(1 & 2), 43–64 (2008)

  • Bleakley, A., Ellithorpe, M.E., Hennessy, M., Jamieson, P.E., Khurana, A., Weitz, I.: Risky movies, risky behaviors, and ethnic identity among Black adolescents. Soc. Sci. Med. 195, 131–137 (2017). doi:https://doi.org/10.1016/j.socscimed.2017.10.024

  • Brennan, R.L., Prediger, D.J.: Coefficient kappa: Some uses, misuses, and alternatives. Educ. Psychol. Meas. 41(3), 687–699 (1981)

  • Brown, T.: Confirmatory Factor Analysis for Applied Research, 2nd edn. Guilford, New York (2015)

  • Brownbill, A.L., Miller, C.L., Smithers, L.G., Braunack-Mayer, A.J.: Selling function: the advertising of sugar-containing beverages on Australian television. Health Promot. Int. (2020). doi:https://doi.org/10.1093/heapro/daaa052

  • Buchanan, L., Yeatman, H., Kelly, B., Kariippanon, K.: A thematic content analysis of how marketers promote energy drinks on digital platforms to young Australians. Aust. N. Z. J. Public Health. 42(6), 530–531 (2018). doi:https://doi.org/10.1111/1753-6405.12840

  • Burke, L.M., Hawley, J.A.: Swifter, higher, stronger: What’s on the menu? Science. 362(6416), 781–787 (2018). doi:https://doi.org/10.1126/science.aau2093

  • Byrt, T., Bishop, J., Carlin, J.B.: Bias, prevalence and kappa. J. Clin. Epidemiol. 46(5), 423–429 (1993)

  • Carletta, J.: Assessing agreement on classification tasks: the kappa statistic. Comput. Linguistics 22(2), 249–254 (1996)

  • Cavazos-Rehg, P.A., Krauss, M., Fisher, S.L., Salyer, P., Grucza, R.A., Bierut, L.J.: Twitter chatter about marijuana. J. Adolesc. Health 56(2), 139–145 (2015a)

  • Cavazos-Rehg, P.A., Krauss, M.J., Sowles, S.J., Bierut, L.J.: “Hey everyone, I’m drunk.” An evaluation of drinking-related Twitter chatter. J. Stud. Alcohol Drug 76(4), 635–643 (2015b)

  • Coates, A.E., Hardman, C.A., Halford, J.C.G., Christiansen, P., Boyland, E.J.: Food and Beverage Cues Featured in YouTube Videos of Social Media Influencers Popular With Children: An Exploratory Study. Front. Psychol., 10(2142), (2019). doi:https://doi.org/10.3389/fpsyg.2019.02142

  • Coleman, R., Hatley Major, L.: Ethical health communication: A content analysis of predominant frames and primes in public service announcements. J. Mass Media Ethics. 29(2), 91–107 (2014). doi:https://doi.org/10.1080/08900523.2014.893773

  • Dayton, C.M.: An introduction to latent class analysis. In: Menard, S. (ed.) Handbook of longitudinal research: Design, measurement, and analysis, pp. 357–371. Academic Press (2008)

  • DeJong, W., Wolf, R.C., Austin, S.B.: US federally funded television public service announcements (PSAs) to prevent HIV/AIDS: A content analysis. J. Health Communication. 6(3), 249–263 (2001). doi:https://doi.org/10.1080/108107301752384433

  • El-Khoury, J., Bilani, N., Abu-Mohammad, A., Ghazzaoui, R., Kassir, G., Rachid, E., El Hayek, S.: Drugs and Alcohol Themes in Recent Feature Films: A Content Analysis. J. Child Adolesc. Subst. Abuse 28(1), 8–14 (2019)

  • Emmers-Sommer, T.M., Allen, M.: Surveying the effect of media effects: A meta-analytic summary of the media effects research in Human Communication Research. Hum. Commun. Res. 25(4), 478–497 (1999)

  • Feinstein, A.R., Cicchetti, D.V.: High agreement but low kappa: I. The problems of two paradoxes. J. Clin. Epidemiol. 43(6), 543–549 (1990)

  • Fishbein, M., Ajzen, I.: Predicting and changing behavior: The reasoned action approach. Taylor & Francis (2010)

  • Fleiss, J.L., Cohen, J.: The equivalence of weighted kappa and the intraclass correlation coefficient as measures of reliability. Educ. Psychol. Meas. 33(3), 613–619 (1973)

  • Garrison, D.R., Cleveland-Innes, M., Koole, M., Kappelman, J.: Revisiting methodological issues in transcript analysis: Negotiated coding and reliability. The Internet and Higher Education 9(1), 1–8 (2006)

  • Glockner-Rist, A., Hoijtink, H.: The best of both worlds: Factor analysis of dichotomous data using item response theory and structural equation modeling. Struct. Equ. Model. 10(4), 544–565 (2003)

  • Gwet, K.L.: Inter-rater reliability: dependency on trait prevalence and marginal homogeneity. Stat. Methods Inter-Rater Reliab. Assess. Ser. 2(1), 1–9 (2002)

  • Gwet, K.L.: Computing inter-rater reliability and its variance in the presence of high agreement. Br. J. Math. Stat. Psychol. 61(1), 29–48 (2008)

  • Gwet, K.L.: Handbook of inter-rater reliability: The definitive guide to measuring the extent of agreement among raters, vol. 2. Advanced Analytics, LLC (2014a)

  • Gwet, K.L.: Handbook of inter-rater reliability: The definitive guide to measuring the extent of agreement among raters (vol. 1: Analysis of Categorical Ratings): Advanced Analytics, LLC (2014b)

  • Hallgren, K.A.: Computing inter-rater reliability for observational data: an overview and tutorial. Tutorials in quantitative methods for psychology. 8(1), 23–34 (2012). doi:https://doi.org/10.20982/tqmp.08.1.p023

  • Harris, J.L., Fleming-Milici, F., Kibwana-Jaff, A., Phaneuf, L.: Sugary drink advertising to youth: Continued barrier to public health progress. University of Connecticut Rudd Center for Food Policy and Obesity, Storrs (2020)

  • Hayes, A.F., Krippendorff, K.: Answering the call for a standard reliability measure for coding data. Communication Methods and Measures 1(1), 77–89 (2007)

  • Hennessy, M., Bleakley, A., Piotrowski, J.T., Mallya, G., Jordan, A.: Sugar-sweetened beverage consumption by adult caregivers and their children: the role of drink features and advertising exposure. Health Educ. Behav. 42(5), 677–686 (2015)

  • Hennessy, M., Bleakley, A., Ellithorpe, M.E., Maloney, E., Jordan, A.B., Stevens, R.: Reducing Unhealthy Normative Behavior: The Case of Sports and Energy Drinks. Health Educ. Behav., 1–12 (2021). doi:https://doi.org/10.1177/10901981211055468

  • Jordan, A., Kunkel, D., Manganello, J., Fishbein, M. (eds.): Media messages and public health: A decisions approach to content analysis. Routledge (2010)

  • Krauss, M., Grucza, R., Bierut, L., Cavazos-Rehg, P.: “Get drunk. Smoke weed. Have fun.”: A content analysis of tweets about marijuana and alcohol. Am. J. Health Promotion 31(3), 200–208 (2017)

  • Krippendorff, K.: Bivariate agreement coefficients for reliability of data. Sociol. Methodol. 2, 139–150 (1970)

  • Krippendorff, K.: Reliability in content analysis: Some common misconceptions and recommendations. Hum. Commun. Res. 30(3), 411–433 (2004)

  • Krippendorff, K.: Content analysis: An introduction to its methodology. Sage publications (2018)

  • Lacy, S., Watson, B.R., Riffe, D., Lovejoy, J.: Issues and best practices in content analysis. Journalism & Mass Communication Quarterly 92(4), 791–811 (2015)

  • Marriott, B.P., Hunt, K.J., Malek, A.M., Newman, J.C.: Trends in intake of energy and total sugar from sugar-sweetened beverages in the United States among children and adults, NHANES 2003–2016. Nutrients, 11(9), (2019)

  • Moran, A.J., Roberto, C.A.: Health warning labels correct parents’ misperceptions about sugary drink options. Am. J. Prev. Med. 55(2), e19–e27 (2018)

  • Munsell, C.R., Harris, J.L., Sarda, V., Schwartz, M.B.: Parents’ beliefs about the healthfulness of sugary drink options: opportunities to address misperceptions. Public Health. Nutr. 19(1), 46–54 (2016)

  • Mus, S., Rozas, L., Barnoya, J., Busse, P.: Gender representation in food and beverage print advertisements found in corner stores around schools in Peru and Guatemala. BMC Res. Notes. 14(1), 402 (2021). doi:https://doi.org/10.1186/s13104-021-05812-4

  • Neuendorf, K.A.: The content analysis guidebook. Sage, Thousand Oaks (2017)

  • O’Keefe, D.J.: Elaboration Likelihood Model. In: Donsbach, W. (ed.) The international encyclopedia of communication (Vol. IV), pp. 1475–1480. Blackwell, Oxford (2008)

  • Oleinik, A., Popova, I., Kirdina, S., Shatalova, T.: On the choice of measures of reliability and validity in the content-analysis of texts. Qual. Quant. 48(5), 2703–2718 (2014)

  • Peteet, B., Roundtree, C., Dixon, S., Mosley, C., Miller-Roenigk, B., White, J., ... McCuistian, C.: ‘Codeine crazy:’ a content analysis of prescription drug references in popular music. J. Youth Stud., 1–17 (2020). doi:https://doi.org/10.1080/13676261.2020.1801992

  • Petty, R.E., Cacioppo, J.T.: Communication and persuasion: Central and peripheral routes to attitude change. Springer-Verlag, New York (1986)

  • Porcu, M., Giambona, F.: Introduction to latent class analysis with applications. J. Early Adolescence. 37(1), 129–158 (2017). doi:https://doi.org/10.1177/0272431616648452

  • Potter, W.J., Riddle, K.: A content analysis of the media effects literature. Journalism & Mass Communication Quarterly 84(1), 90–104 (2007)

  • Primack, B.A., Dalton, M.A., Carroll, M.V., Agarwal, A.A., Fine, M.J.: Content analysis of tobacco, alcohol, and other drugs in popular music. Arch. Pediatr. Adolesc. Med. 162(2), 169–175 (2008)

  • Raykov, T., Marcoulides, G.A.: A Course in Item Response Theory and Modeling with Stata. Stata Press, College Station (2018)

  • Reise, S., Ainsworth, A., Haviland, M.: Item response theory: Fundamentals, applications, and promise in psychological research. Curr. Dir. Psychol. Sci. 14(2), 95–101 (2005)

  • Riffe, D., Lacy, S., Watson, B., Fico, F.: Analyzing media messages: Using quantitative content analysis in research, Fourth edn. Routledge, New York (2019)

  • Russell, C.A., Russell, D.W., Grube, J.W.: Nature and impact of alcohol messages in a youth-oriented television series. J. Advertising. 38(3), 97–112 (2009). doi:https://doi.org/10.2753/JOA0091-3367380307

  • Shrout, P.E., Fleiss, J.L.: Intraclass correlations: uses in assessing rater reliability. Psychol. Bull. 86(2), 420–428 (1979)

  • Singh, J.: Tackling measurement problems with Item Response Theory: Principles, characteristics, and assessment, with an illustrative example. J. Bus. Res. 57(2), 184–208 (2004). doi:https://doi.org/10.1016/S0148-2963(01)00302-2

  • Skalski, P.D., Neuendorf, K.A., Cajigas, J.A.: Content analysis in the interactive media age. In: Neuendorf, K.A. (ed.) The content analysis guidebook, pp. 201–242. Sage, Thousand Oaks (2017)

  • StataCorp: Stata: Release 16 Statistical Software. StataCorp LP, College Station (2019)

  • Stern, S., Morr, L.: Portrayals of teen smoking, drinking, and drug use in recent popular movies. J. Health Communication. 18(2), 179–191 (2013). doi:https://doi.org/10.1080/10810730.2012.688251

  • Streiner, D.L.: Learning how to differ: Agreement and reliability statistics in psychiatry. Can. J. Psychiatry 40(2), 60–66 (1995)

  • Uebersax, J.S.: Modeling approaches for the analysis of observer agreement. Invest. Radiol. 27(9), 738–743 (1992)

  • Ullman, J.B., Bentler, P.M.: Structural equation modeling. In: Weiner, I.B. (ed.) Handbook of Psychology, Second edn., pp. 661–690. Wiley (2012)

  • Underwood, J.M., Brener, N., Thornton, J., Harris, W.A., Bryan, L.N., Shanklin, S.L., ... Chyen, D.: Overview and methods for the Youth Risk Behavior Surveillance System—United States, 2019. MMWR supplements, 69(1), 1 (2020)

  • Vassallo, A.J., Kelly, B., Zhang, L., Wang, Z., Young, S., Freeman, B.: Junk Food Marketing on Instagram: Content Analysis. JMIR Public. Health and Surveillance, 4(2) (2018). doi:https://doi.org/10.2196/publichealth.9594

  • Vercammen, K.A., Koma, J.W., Bleich, S.N.: Trends in energy drink consumption among US adolescents and adults, 2003–2016. Am. J. Prev. Med. 56(6), 827–833 (2019)

  • Zickar, M., Highhouse, S.: Looking closer at the effects of framing on risky choice: An item response theory analysis. Organ. Behav. Hum Decis. Process. 75(1), 75–91 (1998)

  • Zytnick, D., Park, S., Onufrak, S.J.: Child and caregiver attitudes about sports drinks and weekly sports drink intake among US youth. Am. J. Health Promotion 30(3), e110–e119 (2016)

Acknowledgements

This work was funded by the US National Institute of Dental and Craniofacial Research (NIH/NIDCR, grant number R21DE028414-01). Its contents are solely the responsibility of the authors and do not necessarily represent the official views of the NIDCR. We thank our coders (Sean Hinton, Leah Yaker, Hallie Rubinstein, Julia Sciacca, and Charles Zoeller) for their dedication to and effort on this project.

Author information

Corresponding author

Correspondence to Michael Hennessy.

Ethics declarations

Disclosure Statement

No conflicts of interest declared.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Hennessy, M., Bleakley, A. & Ellithorpe, M.E. Evaluating and tracking qualitative content coder performance using item response theory. Qual Quant 57, 1231–1245 (2023). https://doi.org/10.1007/s11135-022-01397-7

  • DOI: https://doi.org/10.1007/s11135-022-01397-7
