Assessing the quality of Objective Structured Clinical Examination (OSCE) reports in pharmacy education: A review of the literature

Objective: The purpose of this systematic review was to identify and assess the quality of published reports of objective structured clinical examinations (OSCEs) in pharmacy education. Methods: English-language articles published between 2000 and 2015 describing OSCEs in pharmacy education were included. Search terms included ‘pharmacy education’, ‘objective structured clinical examination’, ‘clinical skills assessment’, and ‘clinical skills examination’. A previously published checklist of reporting standards for OSCEs in medical education was used to assess the quality of published OSCE reports in specific areas: encounter characteristics, standardized participant (SP) characteristics, training methods, and behavioral measures. Results: Forty-two articles were identified for inclusion. Forty (95%) articles reported the number of encounters, with a mean of 3.0 (SD=4.7). All articles reported the level of learner participating, with most US pharmacy schools conducting OSCEs with third-year pharmacy students. Most articles reported training for SPs prior to the OSCE (n=23, 55%); however, most did not report whether training was provided to OSCE raters (n=22, 52%). Thirty-six (86%) articles reported the type of behavior measure used during OSCEs, but almost half (n=18, 42%) did not report the scale used. Fewer than half of articles (n=19, 45%) reported whether psychometric properties had been obtained for the specific behavior measures used. Conclusions: Overall, published descriptions of OSCEs are varied and inconsistent. Most OSCEs conducted in US pharmacy schools involved third-year pharmacy students. Just over half of published articles reported OSCE training. The lack of published psychometric properties may make implementation of valid OSCEs difficult.


Introduction
Objective structured clinical examinations (OSCEs) were first described in medical education in the 1970s. (Harden, 1979) OSCEs are designed as a test of clinical competence with 'focused attention being paid to the objectivity of the examination' and typically consist of multiple, standard stations that prompt students to perform specified tasks within a defined amount of time. (Harden, 1989) OSCEs evaluate clinical activities and typically involve the use of standardized participants (SPs). SPs are defined as individuals instructed to simulate participants (e.g. patients, providers) in specific scenarios to evaluate learners' clinical skills. OSCEs have been extensively implemented and studied in medical education. More recently, the OSCE has been used to evaluate student performance across a variety of health profession disciplines, including pharmacy. (Cannick et al., 2007; McWilliam and Botwinski, 2012; Pender and de Looy, 2004) In addition, an OSCE has been a required component of the national Canadian pharmacist entry-to-practice licensing examination for over a decade. (Austin et al., 2003) A recent review by Sturpe examining the use of OSCEs in US colleges and schools of pharmacy found that, of 88 pharmacy programs sampled, 32 reported using OSCEs, and approximately half of those not using an OSCE were interested in the technique. (Sturpe, 2010) The Accreditation Council for Pharmacy Education (ACPE) recognizes the use of a variety of strategies, including OSCEs, to assess students' performance-based achievement of educational outcomes and encourages pharmacy programs to stay current with research practices in these areas. (ACPE, 2015) Despite the widespread recognition and utilization of OSCEs in medical education, descriptions vary widely in the literature.
In 2008, representatives of the Association of Standardized Patient Educators developed standards for information that should be reported in published SP descriptions in the medical literature and subsequently reviewed available published reports. (Howley et al., 2008) Howley and colleagues found that increased rigor in SP descriptions was needed, as published reports in the medical literature lacked the detail required to support research findings. The reporting standards they suggested comprised four categories: activity or encounter, standardized patient characteristics, training, and behavior measures. The review of OSCEs in the medical literature by Howley and colleagues was limited to SPs. It should be noted that conducting an OSCE does not require the use of SPs, but in the above-mentioned report of OSCE practices in colleges and schools of pharmacy, all programs currently conducting OSCEs utilized SPs. (Sturpe, 2010) The editors of several pharmacy education journals have also called for increased detail in submitted manuscripts. Recent editorials have noted that sufficient detail regarding the methods and results must be provided for replication and verification of research across the Academy. (Persky and Romanelli, 2016; Janke, 2014) An assessment of the quality of published OSCE reports in pharmacy education is lacking in the literature. The purpose of this systematic review was to identify published reports of OSCEs in pharmacy education and assess the quality of those reports based on a previously published checklist of OSCE reporting standards.

Methods
This review was designed and conducted using the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. (Moher, 2009) Since the research involved a systematic review of published data, no approval was required from the College's institutional review board. Studies were included if they were original articles published in English from 2000 to 2015 that described objective structured clinical examinations in pharmacy education. As of July 1, 2000, the Doctor of Pharmacy (i.e. Pharm.D.) became the sole degree accredited by the Accreditation Council for Pharmacy Education (ACPE) for entry into pharmacy practice. The Pharm.D. curriculum requires an additional year of study in comparison to the Bachelor of Pharmacy degree; this additional year includes courses in pharmacotherapy and patient care and emphasizes clinical knowledge and skills. (Kreling et al., 2010) Databases searched included the Cochrane Central Register of Controlled Trials, Cochrane Database of Systematic Reviews, Cochrane Methodology Register, International Pharmaceutical Abstracts, ERIC, DoPHER, Web of Science, and Global Health. Search terms included 'pharmacy education', 'objective structured clinical examination', 'clinical skills assessment', and 'clinical skills examination'. Relevant article reference lists were used to obtain additional articles. Both study authors evaluated article titles and abstracts for inclusion. The full text of these articles was then reviewed for final inclusion. The definition of OSCEs used in the medical literature was used to develop characteristics of OSCEs for articles to be included in this review. (Harden, 1989) This list of characteristics was developed because activities assessing the clinical skills of pharmacy students could have been published without the specific use of the term OSCE, which would have led to the inadvertent exclusion of relevant articles.
To be included in this review, reports needed to describe assessments that 1) were objective measures of clinical skills, 2) were standardized across testing environments and 3) were conducted in pharmacy students. Because OSCEs do not require the use of SPs, descriptions of OSCEs with or without incorporation of SPs were included in this review. A sub-analysis of those articles using SPs was performed. Articles describing the use of OSCEs with pharmacy school graduates or in post-graduate training programs were excluded from this systematic review.
The methodological quality of each article was assessed based on a checklist developed from the reporting standards for SPs in the medical literature by Howley et al. (Howley et al., 2008) Due to the widespread use of SPs in pharmacy education, minimal modification was needed to apply the standards to OSCEs in pharmacy education (Table 1). Data were extracted from individual reports using a data collection tool adapted from the previously published checklist. (Howley et al., 2008) The data collection tool was developed, pilot-tested, and refined by the study authors and included the following sections: encounter characteristics; SP characteristics; training; and behavioral measure(s) (Table 1). These items represent the minimum standards that should be included in reports of OSCEs in pharmacy education to ensure replicability, validity, and reliability. Study authors individually collected data through full-text review of the articles, and discrepancies were resolved through consensus. Frequencies were calculated for all checklist items.

Results
A total of 637 articles were identified for preliminary screening through the developed search strategy and identification via reference lists (Figure 1). A total of 400 articles were excluded after screening titles and abstracts because they did not use OSCEs to assess pharmacy student performance. The full text of 66 articles was reviewed, with a total of 42 articles identified for inclusion in this review (Table 2). Inter-rater agreement for study inclusion was assessed through calculation of Cohen's kappa, with 0 = no agreement and 1 = total agreement. Study authors had strong inter-rater agreement for study inclusion (kappa = 0.755). A total of 7 reports did not use SPs to assess student performance of clinical skills. As the use of SPs is not a required component of an OSCE, these studies were included in the review except for the description of SP characteristics.
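For context, Cohen's kappa quantifies agreement between two raters after correcting the observed proportion of agreement for the agreement expected by chance alone:

```latex
\kappa = \frac{p_o - p_e}{1 - p_e}
```

where \(p_o\) is the observed proportion of agreement between the two raters and \(p_e\) is the proportion of agreement expected by chance given each rater's marginal inclusion rates; \(\kappa = 0\) therefore corresponds to chance-level agreement and \(\kappa = 1\) to perfect agreement.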

Encounter characteristics
Information gathered about encounter characteristics included the stakes of the OSCE (e.g. high or low stakes), structure of the OSCE (e.g. formative, summative, or both), level of the learner participating, number and lengths of encounters, number of students per SP, number of SPs per case, length of the work day for SPs, whether there was an inter-station activity related to information obtained in the SP encounter, length of the inter-station activity, and who provided feedback to the learner. Thirty-two (76%) articles reported the stakes of the OSCE, with 13 (31%) reports identifying OSCEs as high stakes, 16 (38%) as low stakes, and 2 (5%) as a combination of low and high stakes; 1 (2%) OSCE asked students to volunteer as participants. Thirty-seven (88%) articles reported the level of OSCEs as summative, formative, or both, with the majority including summative assessments (52%, n = 22). All articles reported the level of learner participating in the OSCE, with seven articles (17%) reporting multiple levels of learners (e.g. first-year and second-year students) participating in OSCEs. Of the articles from the United States, most reports (26%, n=11) detailed OSCEs conducted in third-year students. Nineteen (45%) articles reported OSCEs conducted in non-US pharmacy schools. Forty (95%) articles reported the number of encounters in the OSCE, with the average number of encounters per OSCE being 3.0 (SD=4.7); however, the number of encounters spanned a large range (1-26). Thirty-two (76%) articles reported the lengths of encounters, with four (10%) articles reporting a combination of encounter lengths. Eight studies (19%) reported whether feedback was given to students, with five of those studies stating that no feedback was provided and three reporting that SPs provided feedback to students.

Training methods
Information gathered about training characteristics included whether training was provided for the OSCE rater and/or the SP, length of the training provided, who conducted the training, whether any quality control checks were included in the training, and the content of the training. Nineteen articles (45%) reported that OSCE raters were trained before evaluating students. Of the 35 reports of OSCEs utilizing SPs, twenty-four (69%) reported whether SPs were trained prior to the OSCE. Thirty-three articles (79%) did not report the amount of training time for either SPs or raters. Twenty-six articles (62%) did not report who provided the training; where reported, training was most often provided by faculty (n=9, 21%). Fifteen articles (36%) failed to report the content of OSCE training for either raters or SPs. Most articles (n=34, 81%) did not report whether OSCE training contained any quality control checks.

Behavioral measures
Information gathered about behavioral measure characteristics included the type of behavior measure used (e.g. checklist, rating scale, global impression scale), the scale of the behavior measure (e.g. Likert, dichotomous), the number of items contained in the behavior measure, the mode of recording the rater's assessment (e.g. paper, electronic), the content of the behavioral measure, the timing of administration of the behavior measure, and whether the purpose of the behavior measure was reported. Thirty-six (86%) articles reported the type of behavior measure used to evaluate students during the OSCE. A variety of behavioral measures were used to assess student learning during OSCEs, including checklists only (n=14, 33%), rating scale only (n=5, 12%), rubric only (n=6, 14%), open-ended questions (n=1, 2%), checklist and rubric (n=3, 7%), and checklist and global impression scale (n=7, 17%). Six articles (14%) did not report the type of behavior measure used to assess pharmacy students during OSCEs. A variety of scales were used in behavior measures as well, including Likert scales only (n=7, 17%), dichotomous scales only (n=7, 17%), dichotomous and Likert scales (n=4, 10%), and other scales (n=6, 14%). Almost half of OSCE articles (n=18, 42%) did not report the scale used in the behavior measure. Most articles did not report the number of items in the behavior measure (n=21, 50%) or how the behavior measure was recorded (n=39, 93%). OSCEs assessed a range of content, including interpersonal skills only (n=8, 19%), procedures only (n=3, 7%), interpersonal and clinical skills (n=15, 36%), and interpersonal and procedural skills (n=3, 7%). Fifteen articles (36%) did not report the content of the behavior measure.

Standardized participant characteristics
Seven (17%) articles reported that SPs were not used, so these reports were excluded from the SP characteristics results, leaving 35 articles using SPs in their OSCEs. No studies reported SP age, race, or gender. Thirty-two (91%) articles reported the role of the SP, with 15 articles (43%) reporting that SPs served solely as simulators and 17 articles (49%) reporting that SPs served as a combination of simulator and evaluator. Fourteen articles reported the SP work duration, with six articles (17%) reporting SPs worked for a partial day, seven articles (20%) reporting SPs worked for a full day, and one article (3%) reporting that SPs worked more than a full day. Seven articles (20%) reported how many individual SPs portrayed an individual case. Four articles (11%) reported SPs had prior experience as SPs, and two articles (6%) reported SPs had no prior experience. Five articles (14%) reported SPs had relevant background or outside experience, and four articles (11%) reported SPs had none.

Discussion
This systematic review identified published reports of OSCEs in pharmacy education and evaluated the quality of those articles using a previously published checklist of minimum reporting standards. Overall, the descriptions of OSCEs were varied and inconsistent. The data collection tool was used to identify minimum reporting standards, yet many data elements were not identified in published articles (Figure 2). The encounter characteristics standards relate to the structure of the instructional activity in which the performance of the pharmacy student is assessed as an educational outcome. These standards include details of high- or low-stakes evaluation, formative or summative assessment, any feedback provided, and the level of the learner. Our results indicate that less than 50% of authors report whether feedback is provided, whether a post-encounter activity is included within the exercise, the number of students seen by an individual SP, the number of SPs portraying a case, or the work hours of SPs. Most published OSCEs were conducted in third-year pharmacy students. The timing of these activities may assist pharmacy programs in evaluating a student's mastery of clinical skills or readiness for Advanced Pharmacy Practice Experiences (APPEs). With the recent release of the American Association of Colleges of Pharmacy's core entrustable professional activities (EPAs) for new pharmacy graduates, colleges and schools of pharmacy may be able to use OSCEs as one means to demonstrate the 'sufficient competence' needed before students can perform EPAs in an unsupervised fashion. (Haines et al., 2017) Almost half of published reports were from countries outside of the United States, which demonstrates the broad appeal of this type of assessment in pharmacy education. Many of these reports originated from countries (e.g. Malaysia, Japan, Brazil) that may be seeking to increase the clinical component of their curricula and the clinical competence of their pharmacy graduates.
Standardized participant reporting criteria should describe the age, race, gender, role, experience level, and background of the SP participating in the encounter. No published OSCE report detailed the demographics of standardized participants. Current ACPE accreditation standards require pharmacy students to be exposed to a 'diverse population' during APPEs. (ACPE, 2015) OSCEs are time- and labor-intensive endeavors, and therefore not all programs may be able to maintain a robust standardized participant program or recruit standardized participants that reflect diverse populations. However, students and faculty value the contribution of SPs in pharmacy education, and thus it may be beneficial for pharmacy programs to collect SP demographics as a quality control measure in the context of pre-APPE readiness. (Smithson et al., 2015) Training methods standards involve outlining the training provided to SPs and other participants who portray a role in the OSCE; evaluate and rate the OSCE; or provide feedback to the students, whether oral and/or written. To remain an objective assessment of a student's clinical abilities, SPs must be able to 'repeat their story on the required number of occasions and keep it fairly consistent'. (Harden, 1979) It is unclear how authors were able to ensure this consistency, considering that over two-thirds of manuscripts using SPs did not report whether training was conducted prior to the examination and few articles discussed any quality checks contained in training for SPs or raters. Additionally, raters should agree on what is expected of the student prior to the examination. (Harden, 1979) However, less than half of published reports documented whether raters received any training prior to grading students.
The behavioral measures standards provide detail on the specific instrument utilized to measure the learner's performance during the OSCE encounter. Ideally, the actual measurement tool should be included in the report of the OSCE, otherwise information related to the type, scale, number of items, content, purpose, and administration time of the behavioral measure should be reported. However, almost one-third of articles did not report the administration time with even more reports (42%) failing to include the scale of the behavioral measure.
A wide array of content was seen in the published reports. The variety of assessments conducted in pharmacy programs speaks to the breadth of activities a pharmacist is expected to perform upon graduation. It is also important to note that pharmacists perform clinical activities that may or may not directly involve another party (e.g. patient, prescriber); which contrasts with OSCEs in medical education for which an SP is an inherent component. This likely explains why seven OSCE reports did not involve SPs. For example, extemporaneous and sterile compounding require clinical skills and procedural knowledge, but the demonstration of these skills does not necessarily require the use of a standardized patient or prescriber. Therefore, assessment of clinical activities that do not directly include SPs remain valuable contributions to pharmacy education.
Psychometric measures such as reliability and validity are fundamental to education and psychological measurement. (Peeters et al., 2013) Documentation of reliability and validity measures are essential to description of a 'quality' OSCE and help to increase the rigor of the scholarship of teaching and learning. Without these and other elements, pharmacy educators risk implementing assessments that are not internally consistent, do not measure the same construct over time, and are not measuring the construct intended by the instructor. Future OSCE publications should detail psychometric properties to assist other pharmacy educators with implementation of valid assessments. Publication of work in academic literature should aim to support and be a source for scholarly teaching by achieving sufficient quality to be reproducible. This standard serves to expand the knowledge base of pharmacy educators investigating OSCEs or other content areas.
Lastly, the number of published OSCE reports has increased dramatically in recent years (Figure 3). Several factors may be contributing to this finding. The review of OSCEs in US pharmacy programs by Sturpe and colleagues was conducted in 2010. At that time, approximately half of the programs not using OSCEs were interested in the technique. Some of these colleges and schools of pharmacy may have since published the results of implementing and/or expanding OSCEs within their curricula. Additionally, updated ACPE Standards were released in 2015. (ACPE, 2015) The heightened emphasis on assessment throughout the accreditation standards supports OSCEs as one mechanism for evaluating clinical skills among pharmacy students. As OSCEs can be utilized to assess learning outcomes at the student, course, and programmatic levels, the quality of published reports should be assessed and standardized.

Conclusion
Overall, published descriptions of OSCEs are varied and inconsistent. Most OSCEs conducted in US pharmacy schools were in third-year pharmacy students, which may demonstrate pre-Advanced Pharmacy Practice Experience readiness. No studies reported demographics of standardized participants. The number of published OSCE reports in pharmacy education has increased in recent years. Few studies reported OSCE training or psychometric properties of assessment tools, which may make implementation of valid OSCEs difficult.

Take Home Messages
- The overall quality of published OSCE reports in pharmacy education is inconsistent.
- Many published reports detail OSCEs in pharmacy schools outside of the United States.
- Most OSCEs in pharmacy education are conducted in third-year pharmacy students.
- The majority of OSCEs in pharmacy education are summative assessments.
- Most published OSCE reports do not report psychometric properties such as reliability and validity.