Incentive structure in team-based learning: graded versus ungraded Group Application exercises

Purpose: Previous studies on team-based learning (TBL) in medical education demonstrated improved learner engagement, learner satisfaction, and academic performance; however, a paucity of information exists on modifications of the incentive structure of “traditional” TBL practices. The current study investigates the impact of modification to conventional Group Application exercises by examining student preference and student perceptions of TBL outcomes when Group Application exercises are excluded from TBL grades. Methods: During the 2009–2010 and 2010–2011 academic years, 175 students (95.6% response rate) completed a 22-item multiple-choice survey followed by 3 open response questions at the end of their second year of medical school. These students had participated in a TBL-supplemented preclinical curriculum with graded Group Application exercises during year one and ungraded Group Application exercises during year two of medical school. Results: Chi-square analyses showed significant differences between grading categories for general assessment of TBL, participation and communication, intra-team discussion, inter-team discussion, student perceptions of their own effort, and development of teamwork skills. Furthermore, 83.8% of students polled prefer ungraded Group Application exercises, with only 7.2% preferring graded and 9.0% indicating no preference. Conclusion: The use of ungraded Group Application exercises appears to be a successful modification of TBL, making it more “student-friendly” while maintaining the goals of active learning and development of teamwork skills.


INTRODUCTION
Originally developed for large business school classes, team-based learning (TBL) is a well-defined instructional strategy employed in numerous medical schools in the United States [1] and across the globe [2,3]. TBL uses a structured process that relies on individual and group accountability to promote mastery of factual knowledge, development of cognitive skills, and integration of information from multiple sources in addition to active problem solving and cohesive teamwork. TBL employs a three-phase strategy: (1) advance preparation: students independently study a preparatory assignment; (2) readiness assurance: students demonstrate mastery of the assignment through Individual and Group Readiness Assurance Tests (IRATs/GRATs); and (3) application: teams apply knowledge to problem-solving exercises, termed Group Application (GApp) exercises, in which teammates work to reach a consensus answer, followed by whole-class discussion and debate over the best solution to the problem [4]. In addition to RATs, individual accountability is promoted by peer evaluation of each team member's contribution to team productivity.
Recently, Haidet et al. [5] proposed the core design elements of TBL. These include team formation, readiness assurance, immediate feedback, sequencing of in-class problem solving, four Ss (significant problem, same problem, specific choice, and simultaneous reporting), incentive structure, and peer review. This study will focus on the incentive structure of TBL, which is thought to be a critical component of the learning process. In the aforementioned conceptual model of TBL, grading individual performance motivates preparation prior to the TBL exercise, and grading team performance motivates students to collaborate cohesively and engage with peers for maximal success [5]. Previous studies on TBL in medical education demonstrate improved learner engagement [2], learner satisfaction [6,7], emotional intelligence [8], and academic performance with particular benefit for lower-performing students [9]. To our knowledge, there is a void in the literature regarding incentive structure and its relation to TBL. While graded IRATs and GRATs promote thorough preparation, the cornerstone of each TBL module is the GApp exercise. In the 2009-2010 academic year, the study institution moved from a graded GApp exercise to an ungraded GApp exercise in the year two curriculum, eliminating team grades as a motivator for students to actively participate in group problem solving. This change provided the unique opportunity to assess students' perceptions of graded versus ungraded GApps on the overall TBL experience as well as identify specific factors that contribute to student grading preference.
J Educ Eval Health Prof 2014, 11: 6 • http://dx.doi.org/10.3352/jeehp.2014.11.6 • http://jeehp.org

Team-based learning at study site
At the study site, preclinical instruction modalities include lecture, laboratory exercises, clinical case discussions, online/independent study modules, audience response sessions, and TBL. While lecture is the most frequent teaching method used by faculty, TBL modules heavily supplement the core curriculum. Student assessment is primarily accomplished through summative course examinations accounting for approximately 60%-95% of the overall course grades. Individual and group TBL scores, including peer evaluation, account for most of the remaining 5%-40%, with several courses including additional graded assessments.
At the beginning of each academic year, first and second year students are randomly sorted by faculty into teams of 5-7, ensuring diverse backgrounds and experiences among team members. Students remain in these teams through all courses in an academic year and participate in > 25 TBL sessions per year. Sessions last approximately 2-3 hours, with GApps accounting for approximately 80-120 minutes. The RAT and GApps are created by a faculty content expert and independently reviewed by at least one additional faculty member.
For each TBL module, advance assignments are given prior to the TBL session (phase 1) and may include textbook readings, lecture notes, journal articles, and independent study modules. TBL sessions begin with a 10-item multiple-choice IRAT covering material from the advance assignment. Immediately following the IRAT, each team takes the same RAT as a group. TBL facilitators provide immediate feedback on RAT performance and clarify all RAT answers through brief discussion, completing phase 2. The remainder of the TBL session consists of phase 3, the GApp exercise.
GApp exercises consist of clinically based case scenarios paired with a series of multiple-choice questions requiring extensive problem solving and critical reasoning. Questions focus on diagnostic and therapeutic decision-making with emphasis on basic and clinical science integration [10]. Teams are permitted to use reference materials while reaching consensus on GApp answer choices, but consultation across teams is not permitted. Following each question, teams simultaneously reveal answer choices. If all teams choose the same correct answer, the GApp proceeds to the next question. If teams disagree over the best answer, TBL facilitators lead class discussion/debate until resolution of important concepts is achieved.
The TBL strategy ensures that students receive immediate feedback on the RATs, are forced to reach team consensus on application exercises, and simultaneously report and defend their team decisions. High levels of team functioning as well as effective interpersonal communication are critical to strong performance in TBL modules. Individual and team scores are incorporated into each student's course grade. For the population studied, an individual student's TBL score consisted of 30% IRAT, 30% GRAT, and 40% GApp during year one and 40% IRAT, 60% GRAT, and 0% GApp during year two. During year one, peer evaluation occurs at the end of each of the four 10-week courses and serves as a TBL grade multiplier. During year two, peer evaluations occur at the end of each academic term and are incorporated into term 1 and term 2 exam grades.
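The year one and year two weightings described above can be illustrated with a minimal sketch. The component weights come from the text; the function name, the example scores, and the way the peer evaluation multiplier is applied (a simple scaling factor) are assumptions made for illustration only, since the text does not specify how the multiplier is calculated.

```python
# Component weights as stated in the text (fractions of the TBL score).
YEAR_ONE_WEIGHTS = {"irat": 0.30, "grat": 0.30, "gapp": 0.40}
YEAR_TWO_WEIGHTS = {"irat": 0.40, "grat": 0.60, "gapp": 0.00}


def tbl_score(component_scores, weights, peer_multiplier=1.0):
    """Weighted TBL score on the same scale as the component scores.

    peer_multiplier is an ASSUMED simple scaling factor; the text says only
    that peer evaluation serves as a grade multiplier in year one.
    """
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    weighted = sum(weights[name] * component_scores[name] for name in weights)
    return weighted * peer_multiplier


# Hypothetical component scores (percent correct); NOT data from the study.
example = {"irat": 80.0, "grat": 90.0, "gapp": 100.0}
year_one = tbl_score(example, YEAR_ONE_WEIGHTS)
year_two = tbl_score(example, YEAR_TWO_WEIGHTS)
print(f"year one: {year_one:.1f}, year two: {year_two:.1f}")
```

With the year two weights, the GApp score no longer contributes to the TBL score, so the same hypothetical component scores yield a different total than under the year one weights.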

Measures
After a review of literature to assess intended outcomes of TBL, a 22-item multiple-choice survey was developed to measure perceived impact of graded vs. ungraded GApp exercises on the student TBL experience (see Table 1 for survey questions) and reviewed by several faculty members at the study institution. Each item prompted students to consider the likelihood of specific TBL outcomes with graded or ungraded GApp exercises. Items were classified into 6 domains: general assessment, participation and communication, intra-team discussion, inter-team discussion, perceived effort, and teamwork skills. Study participants were unaware of categories, and items were randomly distributed throughout the survey. Answer choices for each item were as follows: (1) graded GApp exercises; (2) both graded and ungraded GApp exercises; and (3) ungraded GApp exercises. Response proportions were compared to the expected proportions (i.e., 33.33%) using a one-sample chi-square analysis. Significance was set to P < 0.002 using a Bonferroni correction. Here, a significant P-value indicates that there is a statistically significant association between students' perception of their TBL experience and grading category. Additionally, three open response questions ("Which Group Application exercise format do you prefer?", "What is the reason(s) for the preference stated in the previous question?", and "Which component(s) of TBL help you with remembering course material?") were included to provide further insight into students' subjective experiences in TBL with regard to GApp exercises as well as generating knowledge outcomes. Responses to open response questions were categorized for quantification during post hoc analysis of survey data.
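The one-sample chi-square comparison against equal expected proportions, together with the Bonferroni-corrected threshold, can be sketched as follows. This is an illustration under stated assumptions, not the authors' analysis code: the function names and the example counts are hypothetical, and the threshold is taken to be 0.05 divided by the 22 items (about 0.0023), which appears to correspond to the P < 0.002 reported in the text. With three answer choices the test has 2 degrees of freedom, for which the chi-square upper-tail probability has the closed form exp(-x/2).

```python
import math


def chi_square_gof_equal(counts):
    """Chi-square statistic for observed counts vs. equal expected counts."""
    total = sum(counts)
    expected = total / len(counts)  # 33.33% of responses per answer choice
    return sum((obs - expected) ** 2 / expected for obs in counts)


def p_value_df2(chi2):
    """Upper-tail probability of a chi-square variable with 2 degrees of freedom."""
    return math.exp(-chi2 / 2.0)


N_ITEMS = 22   # survey items, from the text
ALPHA = 0.05   # assumed family-wise significance level
bonferroni_alpha = ALPHA / N_ITEMS

# Hypothetical counts for one item (graded, both, ungraded); NOT study data.
example_counts = [20, 35, 120]
chi2 = chi_square_gof_equal(example_counts)
p = p_value_df2(chi2)
print(f"chi2 = {chi2:.2f}, p = {p:.3g}, significant = {p < bonferroni_alpha}")
```

The closed-form P-value applies only to the three-choice case used here; an analysis with a different number of categories would need a general chi-square distribution function from a statistics library.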

Procedures
The population selected for this study participated in a TBL-supplemented preclinical curriculum with graded GApp exercises during year one and ungraded GApps during year two at one United States medical school. The study institution employs a graded curriculum (i.e., not pass/fail). With Institutional Review Board approval, the survey was completed at the end of the second year by consecutive classes in academic years 2009-2010 and 2010-2011. As a measure of advance preparatory effort, second year IRAT scores were compared for classes in which GApp exercises were graded (class of 2011) and ungraded (class of 2010). Only scores from TBL modules in which identical RATs were administered were included in analysis.

RESULTS
One hundred seventy-five out of 183 students (95.6% response rate) completed the survey. Response rates for the three open response questions were 35.5% (n = 65), 32.2% (n = 59), and 29.5% (n = 54), respectively. Chi-square analyses were used to determine whether differences in graded versus ungraded GApp exercises impacted students' TBL experience. Results showed statistically significant differences (P < 0.05; Bonferroni correction P < 0.002) on 20 of 22 survey items. Percentage of responses and chi-square results for graded, ungraded, or both graded and ungraded GApp exercises are reported in Table 1. Significant differences between grading categories were noted for (1) general assessment of TBL, (2) student perception of their own participation and communication during TBL, (3) the facilitation of intra-team discussion, (4) the facilitation of inter-team discussion, (5) student perceptions of their own effort, and (6) development of teamwork skills.
Because students were free to list several reasons, the sum of percentages may be > 100%.
To uncover why students prefer graded or ungraded GApp exercises, survey results were further analyzed. Chi-square analysis of responses from only students selecting 'ungraded' for survey item 4 (i.e., those students who responded as 'preferring ungraded' GApp exercises; n = 140) shows a significant association between ungraded GApp exercises and these 140 students' perception of improved inter-team discussion (Table 2). On the other hand, chi-square analysis of responses from only students selecting 'graded' for survey item 4 (i.e., only those students who responded as 'preferring graded' GApp exercises; n = 12) shows a significant association between graded GApp exercises and these 12 students' perception of improved effort (Table 3). However, despite perceived differences in preparatory effort among these 12 students, IRAT data show that students are equally prepared coming into a TBL module regardless of GApp grade weight (Table 4).
Open response questions 1-3 provide additional insight into reasons for student GApp exercise preference (data shown in Tables 5-7, respectively). Notably, the percentages reported in Table 5 for open response question 1 ("Which Group Application exercise format do you prefer?") agree well with those in Table 1 (item 4). Table 6 shows that students preferring ungraded GApp exercises in open response question 1 (n = 55) listed reduced stress, improved discussion, increased efficiency, and an improved learning environment in response to open response question 2 ("What is the reason(s) for the preference stated in the previous question?"), while those students preferring graded (n = 4) listed improved effort/motivation as reasons for their preference. Together, these data indicate that a large majority of students prefer ungraded GApp exercises due to perceived decreases in stress leading to an improvement in the quality of group discussion and improved learning during TBL. Note also that a large majority of those students preferring graded GApp exercises listed group discussion as the most helpful TBL component for remembering course material (Table 7). Conversely, a minority of students prefer graded GApp exercises due to a perceived increase in effort leading to more motivation and better focus during TBL. Pertinent responses to the three open response questions are included in the discussion section.

DISCUSSION
The past several decades have seen a paradigm shift in medical education. The call for learner-centered pedagogy has created numerous strategic innovations in curricular delivery. TBL is an innovative tool to supplement traditional lecture-based undergraduate medical curricula with active learning. GApp exercises challenge students to cooperatively analyze difficult clinical cases while incorporating important principles of basic science, population health, and medical ethics. Indeed, meaningful and well-constructed GApp exercises are the cornerstone of a TBL module and are largely responsible for TBL efficacy, which is noted for improved academic performance [11], professional development, emotional intelligence [8], and student satisfaction [2,7]. The current study investigates the incentive structure of TBL through a modification of conventional GApp exercises by examining student preference and student perceptions of TBL outcomes when GApp exercises are excluded from TBL grades.
Importantly, student survey data in the present study indicate that the perceived effectiveness of GApp exercises in generating knowledge outcomes, developing teamwork skills, and preparing students for clinical clerkships is, in large part, independent of grade weight. Moreover, IRAT grades did not differ significantly (Table 4) after the study institution removed grades from GApp exercises in the second year curriculum, demonstrating undiminished student advance preparation or independent learning prior to TBL. Yet still, 83.8% of students polled prefer ungraded GApp exercises, with only 7.2% preferring graded.
Of students who prefer ungraded GApp exercises, the majority cited a better learning environment during the TBL module with reduced feelings of stress and anxiety and improved group discussion. Data from open response questions further suggest that most students learned more from group discussion with ungraded GApp exercises due to the effective elimination of extraneous discussion for the sake of grades. Relevant responses include the following:
"This year (with ungraded application exercises) was way less stressful and more beneficial to my learning."
"Application exercises test deeper understanding of materials. They sometimes go beyond what is 'testable,' so being ungraded facilitates discussion without the burden/stress of grades."
"The ungraded format is much better because people spend less time arguing and more time learning."
"I feel more ideas were discussed and it made difficult concepts easier to understand because no one was out for points - only education."
"There is less stress so emphasis is placed on learning to learn instead of learning for a grade. It also reduces animosity in teams who get an answer wrong."
"This year (with ungraded GApp exercises) our group was interested not necessarily with arriving at the correct answer (though we deliberated thoroughly and seriously to have sound support for our answer choice), but we were instead most concerned with learning. People are more open to correction when not fighting for a grade."
It is interesting, in light of findings that medical student stress peaks during the first two years of medical school [12], that the majority of second year medical students surveyed in this study indicate an overwhelming preference for ungraded GApp exercises due to reduced feelings of stress and anxiety. While stress is inherent and unavoidable in medical training and practice, it is during the first two years of medical school, when students are uprooted from their friends and family and face increased scholastic workload, competition from peers, and high-stakes standardized testing, that psychological distress can overwhelm and disengagement coping strategies become prevalent. Overwhelming stress in students relying on unhealthy coping mechanisms may manifest as depression, substance abuse, decreased altruism/empathy, unprofessional conduct, and, in a word, burnout [13].
There is a paucity of information about curricular contributions to medical student stress. In developing pedagogical innovations, medical educators must be cognizant of and sensitive to any unintended effect on the learning environment and student well-being. Although several studies have shown that students prefer TBL to other 'learner-centered' educational approaches [6,7], our data suggest that in highly motivated students, such as medical students, extrinsic incentives to participate (i.e., grades) may have the adverse effect of unnecessarily increasing stress levels without concomitant increases in knowledge or other outcomes. Relevant insights from open response questions include the following:
"Medical school is full of stress and the added stress of graded applications are unnecessary and do not significantly contribute to better learning."
"(With ungraded GApp exercises) I can think clearly. I feel better about speaking up and learn more…. People still take it seriously even though it isn't graded."
"(Ungraded GApp exercises) facilitates learning in an environment where it is acceptable to fail. Just because the application isn't graded doesn't mean students don't prepare for it."
"Learning environment is amazing (with ungraded GApp exercises) and fosters excellent discussion because we still feel pressure to support our answers."
"I give full effort regardless of (GApp exercise) grade weight."
While stress can arouse feelings of fear and anger, not all medical students find it unconstructive. Indeed, it can be a powerful motivator. A small minority of students in this study preferred graded GApp exercises despite any perceived increase in stress. These students felt motivated to put forth more effort if the GApp exercise was graded. Relevant responses to open response questions from these students are as follows:
"I enjoy the challenge of grades. Ungraded (application exercises) are too 'low key.' People tend to slack off more and ignore what is going in discussion. There is lots of irrelevant chatter."
"Without grading, GApp exercises may as well be a part of normal class time. People don't try as hard. My group barely even cares to pay attention."
While these students perceived an increase in effort with graded GApp exercises, knowledge outcomes depend on combined learning through IRAT, GRAT, and GApp exercises, and our IRAT data indicate that an individual's mastery of the advance assignment is independent of whether GApps are graded. Moreover, these students represent only a minority of those polled, and 'typical' student perceptions do not indicate decreased effort when GApp exercises are not graded. Future studies will be necessary to quantify any potential reduction of effort.
A limitation of this study is that responses are from two classes of medical students from one institution with exposure to TBL during both the first and second year of their curriculum. The extensive exposure to TBL among our study population may predispose this particular group of students to focus less on grades and more on learning. Results of this study should be interpreted in light of this and therefore may not be representative of all medical students. While much effort was taken to create a valid survey, expert reviewers were from the study institution only and outside review was not utilized. An additional limitation is that, except for the IRAT scores, our reported outcomes are largely based on students' subjective impressions of their experience, as opposed to a more objective evaluation of their learning. That said, no marked trends in overall student performance during second year were noted when graded versus ungraded GApp exercises were used. In light of our findings, future studies should investigate the effectiveness of TBL modules when various portions of the module are ungraded.
In conclusion, medical students perceive a reduction in stress, an improved learning environment, and higher quality group discussion when GApp exercises are ungraded, without sacrificing outcomes in knowledge acquisition or the development of teamwork skills. Students perceive TBL as equally effective regardless of GApp exercise grade weight, but prefer ungraded GApp exercises. Extrinsic incentives (i.e., grades) may not be necessary to ensure participation in active learning strategies and may have the adverse effect of unnecessarily increasing stress levels. The use of ungraded GApps appears, therefore, to be a successful modification of TBL, making it more "student-friendly" while maintaining the goals of active learning and development of teamwork skills among medical students.

Table 1.
Percentage of responses and chi-square for graded, ungraded, or both graded and ungraded Group Application exercises (n = 175)
Domain 5: perceived effort
16) I put forth the most effort during team-based learning (TBL) is
17) I put forth the most effort preparing for TBL is
18) I am more likely to skip that day's lecture periods to study for the TBL module is

Table 2.
Chi-square analysis of responses to domain 4 (inter-team discussion) from students selecting 'ungraded' for survey item 4 (n = 140)

Table 3.
Chi-square analysis of responses to domain 5 (perceived effort) from only students selecting 'graded' for survey item 4 (n = 12)

Table 4.
Individual Readiness Assurance Test (IRAT) grades in TBL modules with graded vs. ungraded Group Application exercises
Data analyzed from team-based learning (TBL) modules for which identical Readiness Assurance Tests (RATs) were administered for both years (n = 20); RATs that were not identical were excluded from analysis. No significant difference (P = 0.76, t-test) was found in mean IRAT grades between the 2008-2009 and 2009-2010 academic years.

Table 7.
Responses to question 3: "Which component(s) of team-based learning help you with remembering course material?"
Responses categorized during post hoc analysis. Data are presented as percentage. Because students were free to list several team-based learning (TBL) components, the sum of percentages may be > 100%. Data in right three columns organized by preference stated in open response question 1.