Does using active learning in thermodynamics lectures improve students’ conceptual understanding and learning experiences?

Encouraging ‘active learning’ in the large lecture theatre emerges as a credible recommendation for improving university courses, with reports often showing significant improvements in learning outcomes. However, the recommendations are based predominantly on studies undertaken in mechanics. We set out to examine those claims in the thermodynamics module of a large first year physics course with an established technique, called interactive lecture demonstrations (ILDs). The study took place at The University of Sydney, where four parallel streams of the thermodynamics module were divided into two streams that experienced the ILDs and two streams that did not. The programme was first implemented in 2011 to gain experience and refine logistical matters and repeated in 2012 with approximately 500 students. A validated survey, the thermal concepts survey, was used as pre-test and post-test to measure learning gains, while surveys and interviews provided insights into what ‘active learning’ meant in terms of student experiences. We analysed lecture recordings to capture the time devoted to different activities in a lecture, including interactivity. The learning gains were in the ‘high gain’ range for the ILD streams and ‘medium gain’ for the other streams. The analysis of the lecture recordings showed that the ILD streams devoted significantly more time to interactivity, while surveys and interviews showed that students in the ILD streams were thinking in deep ways. Our study shows that ILDs can make a difference in students’ conceptual understanding as well as their experiences, demonstrating the potential value-add that can be provided by investing in active learning to enhance lectures.


Introduction
The objective of instructors and educational researchers is to find ways in which student learning may be improved. In the context of large first year science courses, the educational literature is consistent in recommending the facilitation of 'active learning' (Vernon and Blake 1993). The evidence in support of active learning is so convincing that prominent researchers are now calling for 'second generation' research: the how and what of active learning implementations (Freeman et al 2014) along the lines described in this paper.
Active learning involves increasing student participation, or 'interactivity', for the purpose of positively affecting student learning and attitudes. The instructional method used to facilitate interactivity is commonly known as 'interactive engagement' (Hake 1998, Meltzer and Manivannan 2002). Increased interactivity in lectures can be introduced using a range of techniques such as demonstrations (Crouch et al 2004); personal response systems, or 'clickers' (Draper and Brown 2004, Sharma et al 2005, Beuckman et al 2007, Keller et al 2007, MacArthur and Jones 2008, Willoughby and Gustafson 2009); peer instruction (Mazur 2001, Lasry et al 2008); and interactive lecture demonstrations (ILDs) (Sokoloff and Thornton 1997, Sharma et al 2010). The last of these is the focus of this study.
The ILDs are a highly structured and comprehensive active learning technique. Each ILD centres on a carefully selected demonstration comprising an experiment or a series of experiments that can be done in a lecture theatre. Real-time data are gathered, generally using a computer, and the results are displayed to all students. The interactivity is enacted through an eight-step procedure listed in table 1. Where there is more than one experiment in an ILD, the eight-step procedure is repeated for each experiment.
The eight steps can be used or adapted for use with other experiments with the intent of increasing interactivity. However, the specific series of ILDs utilized in this study is based on key robust student alternative conceptions that are not congruent with scientific understandings. Discipline-based educational studies have found that shifting students' understandings in these key conceptual areas is particularly effective. In other words, improving students' understanding is not simply related to increasing interactivity but is also subject matter dependent, since it involves addressing specific alternative conceptions (Andrews et al 2011) and engaging with context (Buncick et al 2001). Consequently, it is important to demonstrate the utility of active learning in different topics and in different contexts.
Our study incorporates ILDs into thermodynamics, a topic that has not been explored to the same extent as mechanics and electricity in terms of student learning and pedagogies. Thermodynamics has particular concepts that are robust and difficult to shift, which lends them to development as ILDs (Georgiou and Sharma 2010). The purpose of this study was to investigate the following research questions in the topic of thermodynamics with a large first year undergraduate cohort.
1. Is the conceptual understanding of students who engage in ILDs as measured by a conceptual survey higher than those who do not engage with ILDs?
2. What do lectures with ILDs look like in comparison to those that do not have ILDs?
3. What are student experiences of ILDs?
The ILDs were incorporated over two years, 2011 and 2012. The first year of implementation gave lecturers experience and confidence in implementing the ILDs, provided the opportunity to refine logistical matters and informed the second year of study. The second year of the study sought to systematically implement active learning in order to address the stated research questions.

The context
The study occurred within the ten-lecture thermodynamics module of a large first-year calculus-based mainstream physics course, referred to here as the regular course. The course has four concurrent streams and three modules: mechanics, thermodynamics, and waves. Module content is provided in a detailed syllabus and is common between the concurrent streams, as is the assessment. However, the lecturers have freedom to deliver lectures as they wish, thus providing the opportunity to implement ILDs in two streams of the thermodynamics module and leaving the two remaining streams unaltered. Lecture notes and recorded lectures (Lectopia) are available on the online e-learning site (Blackboard).
The regular course occurs over 13 teaching weeks and consists of three one-hour lectures and one one-hour tutorial per week, and eight three-hour laboratory sessions over the semester. The textbook used is Young and Freedman's University Physics 12th edn (Young and Freedman 2007). The ten-lecture thermodynamics module covers temperature, heat, mechanisms of heat transfer, thermodynamic processes, heat engines and entropy.

Table 1. The eight-step ILD procedure.
1. The instructor describes the demonstration and does it for the class without measurements displayed.
2. The students are asked to record their individual predictions on a Prediction Sheet, which is collected at the end of the session, and which can be identified by each student's name written at the top. (The students are assured that these predictions will not be graded.)
3. The students engage in small group discussions with their one or two nearest neighbours.
4. The instructor elicits common student predictions from the whole class.
5. The students record their final predictions on the Prediction Sheet.
6. The instructor carries out the demonstration with measurements (usually graphs collected with micro-computer-based laboratory tools) displayed on a suitable display (multiple monitors, LCD, or computer projector).
7. A few students describe the results and discuss them in the context of the demonstration. Students may fill out a Results Sheet, identical to the Prediction Sheet, which they may take with them for further study.
8. Students (or the instructor) discuss analogous physical situation(s) with different 'surface' features. (That is, different physical situation(s) based on the same concept(s).)

The course assessment is by assignments (10%), tutorial attendance (2%), laboratory work (20%), an in-lab test (5%) and a final three-hour examination (63%). The tutorials and laboratory sessions encourage collaborative learning.
The final examination consists of short and long response qualitative and quantitative questions (no multiple-choice).

Instrument and analysis
To answer the first research question, the thermal concepts survey (TCS), validated with more than 2000 students, was used to measure conceptual understanding (Wattanakasiwich et al 2013). The survey can be used in two parts. Part 1, the first 16 questions, covers 'intuitive' concepts and concepts encountered as part of the school curriculum in the state of New South Wales in Australia: temperature, heat transfer and the ideal gas law. Part 2 covers the first law of thermodynamics, thermodynamic processes and engines, subject matter not covered in senior high school and unfamiliar to the vast majority of first year students. Part 1 was therefore used as the pre-test and post-test to measure gains in this study. Learning gains were calculated, using the measure of 'normalized gain' (Hake 1998), for students who completed three or four of the ILDs or submitted three of the four exercises, and who also completed both the pre-test and post-test. The effect size, d, was calculated following Minium et al (1993, pp 364-6).
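As an illustrative sketch only, the two measures named above could be computed as follows. The normalized gain follows the usual class-average form attributed to Hake (1998), with the maximum score set to the 16 questions of part 1 of the TCS. The effect size is written here as a standardized mean difference with a pooled standard deviation, which is an assumption on our part since the exact formula from Minium et al is not reproduced in this paper. The function names and the scores are hypothetical and are not data from the study.

```python
from math import sqrt
from statistics import mean, stdev

MAX_SCORE = 16  # part 1 of the TCS has 16 questions


def normalized_gain(pre_scores, post_scores, max_score=MAX_SCORE):
    """Class-average normalized gain: (post - pre) / (max - pre)."""
    pre_mean = mean(pre_scores)
    post_mean = mean(post_scores)
    return (post_mean - pre_mean) / (max_score - pre_mean)


def effect_size_d(group_a, group_b):
    """Standardized mean difference with a pooled SD (assumed formula)."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * stdev(group_a) ** 2 + (n_b - 1) * stdev(group_b) ** 2
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Made-up scores out of 16, for illustration only.
pre = [6, 8, 7, 9, 5]
post = [10, 12, 11, 13, 9]
gain = normalized_gain(pre, post)
d = effect_size_d(post, pre)
```

The gain is computed from class means rather than averaged over individual students, and either convention could be substituted depending on the analysis being replicated.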
To answer the second research question, ILD lectures were compared with normal lectures. To capture and examine what occurred in lectures, we analysed lecture recordings and coded according to the criteria shown below.
• Administration: lecturer is providing information or answering questions about scheduling, assessment etc. Lecturer may or may not seek feedback from students.
• Demonstrations: lecturer is showing scientific equipment and explaining or relating to physical principles. No feedback from students is solicited.
• Interaction: lecturer is seeking feedback through small groups or a whole cohort discussion. ILDs fall into this category.
• Transmission: lecturer is telling and explaining lecture slides or writing on the chalkboard. No feedback from students is solicited.
• Exercises: lecturer deploys a worksheet with or without feedback and discussions.
• Deadtime: lecturer allows time for settling students or organizing material. This category was included so that the total lecture time for all lectures totalled 60 min and could be represented as a proportion of the lecture hour.
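The coding scheme above lends itself to a simple tally. The following is a minimal sketch, assuming each recording has been coded as a list of (category, minutes) segments. The data layout, the function name and the example values are hypothetical and do not come from the study.

```python
from collections import defaultdict

CATEGORIES = (
    "Administration", "Demonstrations", "Interaction",
    "Transmission", "Exercises", "Deadtime",
)
LECTURE_MINUTES = 60  # every lecture is normalized to the lecture hour


def time_proportions(segments):
    """Return the percentage of the lecture hour spent in each category."""
    totals = defaultdict(float)
    for category, minutes in segments:
        if category not in CATEGORIES:
            raise ValueError(f"unknown category: {category}")
        totals[category] += minutes
    # Deadtime absorbs the remainder, so the coded time must fill the hour.
    if abs(sum(totals.values()) - LECTURE_MINUTES) > 1e-6:
        raise ValueError("coded segments must total 60 minutes")
    return {c: 100 * totals[c] / LECTURE_MINUTES for c in CATEGORIES}


# Hypothetical coding of one lecture (illustrative values only).
example = [
    ("Deadtime", 3), ("Administration", 2), ("Transmission", 25),
    ("Interaction", 12), ("Demonstrations", 6), ("Transmission", 9),
    ("Exercises", 3),
]
shares = time_proportions(example)
```

Requiring the segments to total 60 min mirrors the role of the Deadtime category, so that every lecture can be compared as a proportion of the same hour.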
To answer the third research question, a survey was administered to students who experienced the ILDs, streams 1 and 2. The survey had 12 Likert scale items (from strongly disagree, through neutral, to strongly agree) designed to gather feedback on various aspects of the student experience of learning with ILDs and six such items intended to elicit feedback on specific ILDs. The percentages of students strongly agreeing and agreeing were combined to provide the percentage of students in overall agreement, while those strongly disagreeing and disagreeing were combined to provide overall disagreement. There were two free response questions. The first asked students to state their favourite ILD and give a reason for their choice, and the second asked for their least favourite ILD, again with a reason.
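The combination of response categories described above can be sketched as follows. This assumes a five-point scale with the labels shown, which is our reading of the scale description, and the function name and responses are hypothetical rather than taken from the survey data.

```python
from collections import Counter

AGREE = {"strongly agree", "agree"}
DISAGREE = {"strongly disagree", "disagree"}


def overall_agreement(responses):
    """Return (overall agreement %, overall disagreement %) for one item."""
    n = len(responses)
    if n == 0:
        raise ValueError("no responses")
    counts = Counter(responses)
    agree = sum(counts[label] for label in AGREE)
    disagree = sum(counts[label] for label in DISAGREE)
    return 100 * agree / n, 100 * disagree / n


# Illustrative responses to a single item, not data from the study.
item = ["strongly agree"] * 3 + ["agree"] * 4 + ["neutral"] * 2 + ["disagree"]
agree_pct, disagree_pct = overall_agreement(item)
```

Neutral responses count towards the total but towards neither percentage, so the two figures need not sum to 100%.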
Since the survey had been administered at the end of the last lecture with the post-test, the expectation had been that students would be brief when responding to the free response questions. Hence the responses would not lend themselves to elaborate qualitative coding. Instead, a simple categorization was employed. Students provided none, one, or many reasons for why an ILD was their favourite. Reasons that exhibited similarities were placed into categories. Descriptions were developed for each such category and these were refined iteratively. The process was repeated for the least favourite ILD. Interestingly, there were parallels in the emergent categories for favourite and least favourite ILD as well as with the data from the Likert scale items. This allowed comparisons across the breadth of the survey, integrating the results into emergent themes. Six semi-structured interviews were also conducted to triangulate survey results. This triangulation provided several themes which help explore details of the ILD technique and active learning.

Participants
The three lecturers, coded as staff-1 to staff-3, are described in table 2. Staff-1, an expert in the physics education field, implemented ILDs, while staff-3, an established academic, presented their lectures as normal in both years. Staff-2, an early career academic, was trained in using the ILDs in 2011 and was comfortable implementing ILDs independently in 2012. Staff-2 had a second stream, in which they lectured as normal in both years.
In 2012, approximately 500 students were enrolled in regular physics. The students would have completed a physics course in senior high school. The students have university entrance rankings under the Australian Tertiary Admission Rank (ATAR) scheme in the range 85-95, making this a relatively homogeneous group.

The ILDs and normal lectures
The ILDs involve a specific sequence of carefully selected experiments: an introduction to the equipment, an introduction to the problem, time for students to make a prediction individually on a hand-in worksheet and then confer with peers, the soliciting and writing of some predictions on the board by the lecturer, the performing of the experiment in real time with collection and presentation of data via the data projector and, finally, a whole class discussion of the results in the context of the predictions noted earlier on the board (see table 1 and Sokoloff and Thornton 1997). The students are handed two identical worksheets (of different colours): one to write their predictions and hand in, and the other to record outcomes and take home. The hand-in worksheets also provide a mechanism for tracking student participation. The hallmark of this entire process is an on-going discourse in which the lecturer is required to adopt a particular sequence of class interactions (shown in table 1) that facilitate active learning. The ILDs consume a significant portion of lecture time, requiring lecturers to adjust lecture content.
There were a total of six ILDs, covered in four lectures. ILD1 was on heat and temperature, ILD2 on specific heat capacity, ILD3a (pee-pee boy) on the first law, ILD3b on isobaric processes, ILD3c (fog in a bottle) on adiabatic processes, and ILD4 on heat engines. Each separate ILD may feature a number of experiments and therefore consumed different amounts of class time. In the absence of hand-in sheets for the non-ILD streams, students were asked to complete and hand in a short pen and paper exercise for the purpose of recording participation. Lecturers allocated around 5 min at the beginning of the lecture to complete the exercise and students could complete these exercises in groups or individually. The lectures in these streams (3 and 4) then continued as normal.

Procedure
All lecture streams covered the same content and in roughly the same sequence. The ILDs and the short exercises mentioned above occurred in four of the ten lectures: lectures 1, 3, 7 and 9. Apart from the implementation of each ILD or exercise, the lecturer had the freedom afforded by the system to lecture in their natural manner. Recorded lectures (available to students) were downloaded and analysed using the criteria described earlier.
The TCS was administered in the first laboratory session as the pre-test and in the final lecture as the post-test. The ILD and exercise worksheets were collected with student identification in lectures 1, 3, 7 and 9. Surveys were also deployed in the ILD streams in the last lecture to gather students' experiences.
Six half-hour interviews were conducted in the following semester, then transcribed and coded. The researcher invited students to volunteer for the interviews through an announcement made during tutorials.
The numbers of students who completed the different components and met the criteria for the particular analysis for each research question vary. Hence, the numbers of students are quoted progressively as we present the results. Table 3 shows the pre-test scores, post-test scores and normalized gains for part 1 of the TCS for students who completed three or four of the ILDs or submitted three of the four exercises.

Research question 1: learning gains
We note two points from these data. First, there is an overall improvement, with gains in the range 0.27-0.40 and an average of 0.31, which is considered medium to high relative to existing data (Hake 1998). Second, the streams with ILDs exhibit higher gains, as indicated by the normalized gain measure and by the effect size, d, calculated from the comparison between the highest and lowest means (see Minium et al 1993, pp 364-6). The effect size of 0.40 indicates a noteworthy effect: it falls in the medium range, which is amongst the best for educational studies in authentic settings.

Research question 2: what do lectures look like?
Figure 1 illustrates the percentage of time devoted to activities within each of the categories obtained by analysing the lecture recordings for the seven lectures that were audibly recorded. We note that the amount of interactivity in the least interactive set of ILD lectures, stream 2, is still some 60% more than in the most interactive set of normal lectures, stream 4. Comparing the difference for the same lecturer, who was involved in streams 2 and 3, the highly structured ILDs in four lectures resulted in a marked increase in interactivity. The difference between the two ILD streams can be ascribed to a combination of two factors: first, additional interactivity utilised by lecturers that was not part of the ILD procedure (such as clickers, see for example Liu and Taylor 2013) and second, more time devoted to interaction during the prescribed ILD procedures, namely steps 3, 7 and 8 in table 1.

Research question 3: what do students have to say?
3.3.1. Did the ILDs help understand specific physics ideas?
We start by considering student responses to the survey items on specific ILDs, as well as the votes received for favourite and least favourite ILD (see table 4). Around two thirds of the students were in overall agreement that the ILDs did help them understand specific ideas better. ILD3a (described in Wattanakasiwich et al 2012) was by far the most popular favourite with 62 votes, while ILD4 received the most votes for least favourite with 38. A total of 27 students did not vote for their favourite ILD while 50 did not vote for their least favourite. In the free responses, a total of 116 students commented, providing 143 reasons for their favourite ILD, while 91 students commented, providing 92 reasons for their least favourite. It is noteworthy that fewer students chose a least favourite ILD or provided a reason for that choice. The differences in the numbers possibly indicate that student experiences were more positive than negative. The predominant reason for the popularity of ILD3a was that it was fun and interesting. Other ILDs attracted this reason but to a much lesser extent. No particular reason stood out for ILD4 receiving the most votes for least favourite. With 23 votes, ILD4 was also the second most popular favourite. This points to the diverse ways in which students engage within learning environments. There were several miscellaneous negative and positive comments, which are also shown in table 4. The other free response categories and survey items are discussed below according to their themes.
3.3.2. Clarity of physics: did the ILDs make the physics clearer so it would be easier to understand?
Four items from the Likert scale, shown in table 5, correspond to this theme. More than 70% of the students are in overall agreement that ILDs support lectures and help in understanding concepts. In the free responses, 18 students commented that a reason for voting an ILD as favourite was that it helped with understanding the physics, while a reason provided by 21 students for selecting an ILD as least favourite was that it did not do so. Consequently, it is important that ILDs clarify the content and help with understanding concepts. Sometimes demonstrations in lectures can be just for fun. However, it has been noted that, often, such demonstrations do not result in better understanding (Crouch et al 2004). Since ILDs take time and require the students to invest as part of active learning, they need to be well connected to lecture content and help with understanding concepts.
The Likert scale items shown in table 6 correspond to the next theme. Some 65% of the students are in overall agreement that ILDs provide opportunities for scientific reasoning while 81% indicated that ILDs were challenging for learning. Of particular note is that 73% of students were in overall agreement that the predictions helped them realize their misconceptions. The term 'misconception', rather than 'alternative conception', was selected as it is comprehensible to students. This notion of misconceptions is picked up in the free responses, with students commenting that they voted particular ILDs as favourite because they showed their misconceptions. A comment for why an ILD was least favourite was that the student felt it did not address any misconceptions. Student comments included:
• Has evoked a curiosity in me
• Bit confusing but helpful
• It gave me a mental picture
• I had conflicting reasoning to correct
• Showed my misconceptions most vividly
• Very basic, not much change, and no misconceptions
The confusion arising during the process of changing one's misconceptions is also indicated as a reason why an ILD could be voted least favourite. In total, 26 and 27 student comments fell into this category. Consequently, in order to shift persistent student understandings that are not congruent with scientific understandings, it is important that ILDs address alternative conceptions, even if students cite being 'confused' and having 'conflicting reasons' as grounds for not favouring their experiences of ILDs.
3.3.4. Technique: were the techniques associated with the ILDs appropriate?
Four items from the Likert scale, shown in table 7, correspond to this theme. Some 65% of the students are in overall agreement with three of the statements. However, only around half of the students were in overall agreement with the statement that they had the opportunity to discuss their opinions with the lecturer. While this can be of concern, the fact that half felt they had this opportunity in a large lecture class is encouraging. There were 41 free responses positively referring to technique-related matters associated with their least favourite ILD. This suggests that students recognize and appreciate high quality technique.
The interplay between the themes described above crystallized in the interviews. During the interviews, students were able to elaborate on whether and why the ILDs were different to 'regular' demonstrations.
Um… the ones that … well definitely the ones that made you think of course were a lot more helpful because they develop your sense of understanding and you think of a question but the ones that they just show at the front they were also really good because its … sometimes it was fun so it is a good way to remember a concept and yeah … so like... both were pretty good but in terms of probably learning … the ones that they gave us … that we had to make predictions were probably more useful.
They are (regular demonstrations) probably not really directly like not really um … a method for teaching but really it's the questions that we kind of think about why things are the way they are that makes us … helps us relate a concept we have learned but there's still quite a distinction between contextually seeing things and seeing things. They (ILDs/demonstrations) do that different levels of impact.
Yeah and I think that's um … actually really good because it means everyone is really focused and um … yeah … it is um, a lot easier to concentrate that way and follow what's going on and stuff.
One student also highlighted the risk of losing interest because of the freedom to speak with peers that the ILDs allow:
Um, just if we are given too much time to ourselves to think about things or too much time to discuss with our neighbour and that sort of thing it is sort of off putting a little bit, it makes us drift and talk about other things.
Students also made comments indicating a desire to interact. When asked what a lecturer could do to guarantee or facilitate student engagement, remarks were as follows:
Well if he is doing an experiment he'll ask us like 'why it happens like that' and 'can you explain that' and that forces us to think about it rather than just explaining to us … I think it is important for the lecturer to engage the student and not just keep going through the things on their own … not engaging your student and just talking by yourself they will never learn anything … I spoke to other students and they feel the same way …

Discussion and conclusion
Interactivity (active learning) is commonly presented as a credible solution to the problem of a reported lack of efficacy from 'traditional' lectures (Cummings 2013). However, there is still a considerable amount of uncertainty around just how successful the active learning agenda actually is (Prince 2004), and there are calls for understanding what the 'interactive learning' process entails (Freeman et al 2014). We are part of the community of healthy sceptics of active learning programmes investigating active learning in more detail. We argue for diverse implementations of active learning in different topics, different countries and different local institutional contexts. Our study is one such effort to share our experiences with an active learning programme in a large first year thermodynamics module, with ILDs in two streams while two other streams proceeded as normal. The ILDs are highly prescriptive, meaning that most reasonable attempts at implementation will have sufficient chances of incorporating a fair …
Year one was a pilot to familiarize the lecturers with the techniques. Systematic data were collected in the second year. The first research question examined the learning gains using a validated conceptual survey. The data showed medium gains in streams that proceeded as normal and high gains in the ILD streams. This is in line with other studies in mechanics, such as Sharma et al (2010). The next research question probed what happens in the lectures in an effort to examine the differences between the ILD and other streams. Some noteworthy, and expected, differences were uncovered between the ILDs and the normal lectures. Analysis showed that, as expected, students in the ILD streams had more time to discuss with their peers, as articulated in table 1.
The last research question focused on what the students experienced and their perceptions. The major difference was in what the students were doing, affirming the findings from the lecture recordings. Students had 'more opportunities' to discuss with peers and the instructor. They also felt that they were understanding concepts, their learning and views were being challenged and they appreciated aspects of the ILD technique. In interviews, students further explicitly noted that they prefer to be engaged, and according to students, committing to predictions or completing ILDs was successful in achieving this.
The carefully crafted ILDs create an environment in which the lecturer and the students take 'time-out' to do different things, as advocated for active learning. Our study shows that student learning and experiences benefited from the implementation of the highly structured ILDs. The notion of active learning has been around for some time, but there are limited research reports detailing implementation and results in different topics, different countries and contexts. From this perspective, active learning is in its infancy, and there is a need to pursue the development of associated techniques and ensure the dissemination of its successes and failures.