Abstract
Retrospective rating scales are widely used for the formal assessment of typical performance. To maximize the quality of ratings, the raters most familiar and interactive with ratees are routinely recommended. This recommendation, however, fails to distinguish among sampling parameters of the observations on which ratings are based, parameters that may matter differently when assessing different classes of behavior. We hypothesized that systematic observational schedules would be more important than familiarity/interaction per se for ratings of public events, whereas the recommendation would hold for ratings of private events. We examined these hypotheses with the Psychotic Inpatient Profile (PIP), which provides separate factor scores for ratings of public and private events, in a quasi-experimental study of adult inpatients of mental hospitals. A large multi-institutional data set provided retrospective PIP ratings from two types of raters. For each client, the most familiar/interactive local clinical staff completed the PIP after observing on an ad lib schedule in the course of ongoing job duties. Unfamiliar, noninteractive raters completed the PIP for each client after observing on a systematic time-sampling schedule for the purpose of coding an entirely different instrument. Data were selected so that each of 189 clients received PIP scores from four raters reflecting functioning during the same time period, based on day-shift observations by one rater of each type and evening-shift observations by one rater of each type. Analyses of variance, consistency/discriminability of ratings, and prediction of social-action outcomes all supported the hypotheses. We discuss alternative strategies that are better suited to assessing typical performance in most circumstances, and we provide recommendations for improving the adequacy of observations in those circumstances in which a standardized retrospective rating scale could be a cost-effective assessment strategy.
Additional information
This study was the basis of a master's thesis at the University of Houston by the senior author under the direction of the junior authors. Richard M. Rozelle served on the examination committee. This study was partially supported by grants to Gordon L. Paul from the National Institute of Mental Health, Public Health Service (MH-15353; MH-25464); the Illinois Department of Mental Health and Developmental Disabilities; the Joyce Foundation; the MacArthur Foundation; the Owsley Foundation; the Cullen Foundation; and the Center for Public Policy of the University of Houston.
Cite this article
Braun, G.B., Paul, G.L. & Mariotto, M.J. Familiar/interactive raters are not always best: The influence of sampling schedules and class of behavior. J Psychopathol Behav Assess 15, 153–176 (1993). https://doi.org/10.1007/BF00960615