Educational Explorations of Chemical Kinetics in a Problem Based Learning Context

The effectiveness of Problem Based Learning (PBL) in the context of chemical education was tested by using different sets of computer based simulations as teaching aids over a two-year period at a major college in the Southern United States of America. By running simulations of a simple reaction system on a Java based simulator, students earned a substantial improvement in their test scores. The students' improvements, measured with a Solomon four-group test composed of multiple choice questions, were similar to or better than the improvements of students who followed the traditional teaching and learning study path. This study underlines the fact that PBL can be effectively used in a college classroom to complement the textbook and to raise awareness amongst students of the time and effort needed to achieve measurable progress in a simulated research environment.


INTRODUCTION
Problem Based Learning (PBL) is a learner centered teaching approach inspired by constructivist theories and designed to empower learners to integrate theory and practice by challenging them with a problem [1]. Initially introduced for medical students to test their understanding of medical theories acquired in lecture classes, the PBL approach quickly spread throughout virtually the entire undergraduate curriculum [2], including chemistry and chemical engineering [3][4][5]. While there is no formal definition of PBL, Boud and Feletti [6] identify several of its features: it uses stimulus material to help students develop a solution to an important topic, it presents a problem as a simulation of real life practice, it requires guidance, it promotes cooperative learning, and it helps learners to identify their own lack of knowledge and to adjust their learning process.
Chemical kinetics is the study of chemical reactions and how they proceed over time [7]. It is a central aspect of chemistry that can be used to unveil the elementary processes of a reaction's mechanism. Unlike other aspects of chemistry, chemical kinetics makes heavy use of differential equations and requires at least a basic understanding of complex mathematical concepts. Because of that, chemical kinetics is usually incorporated in its most elementary form in second and third year undergraduate classes, while the bulk of the discipline is reserved for more advanced graduate courses [8]. On the other hand, as pointed out by Van Berkel [9], when college students cannot grasp the process behind a chemical phenomenon, chemistry as a discipline runs the risk of becoming just a mnemonic exercise.
Justi and Gilbert [10][11][12] extensively reviewed different aspects of the practice of teaching science in general and chemistry in particular. Specifically, Justi [13,14] focuses on the use of models in chemical education. Justi [13] identifies eight different conceptual frameworks (models) used in the teaching and learning of chemical kinetics, namely the anthropomorphic model, the affinity corpuscular model, the first quantitative model, the mechanism model, the thermodynamics model, the kinetic model, the statistical mechanics model, and the transition state model. The last of these is characterized, amongst other things, by the use of differential equations, by rates being calculated in relation to the formation and decomposition of the activated complex, and by addressing the complexity of the reaction's mechanism. These features are consistent with the conceptual framework of the present investigation, which attempts to bring the complexities of a chemical reaction down to the students' level.
Historically, the Lindemann theory was first introduced in 1921 by Frederick Lindemann and later developed by him together with Cyril Hinshelwood [7]. The name Lindemann-Hinshelwood theory in fact derives from the cooperation of the two British investigators. The theory was designed to explain the gas phase associative reaction mechanism. According to the Lindemann-Hinshelwood theory, when reagents A and B collide they form an energized adduct that can either be stabilized, thereby leading to the formation of the reaction's products, or decompose because of its inability to dissipate the extra kinetic energy released by the formation of the new bonds. In the Lindemann-Hinshelwood theory, the mechanism of the reaction can therefore be broken down into three simple steps: (1) the formation of the energized adduct from A and B, (2) the decomposition of the energized adduct back into the reagents, and (3) the stabilization of the energized adduct by collision with a third body M. Here M is usually the bath gas, present in great excess in the reaction chamber, which collides with the adduct, carries away the extra energy and finally stabilizes the adduct into the reaction products; the symbol * refers to an energized state, either of the adduct or of the third body.
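The three steps described above can be written out explicitly. The following is a sketch reconstructed from the verbal description only; the rate coefficient symbols k_a, k_b and k_c are labels assumed here for illustration and do not come from the original text. The last line is the standard steady-state result obtained by setting the rate of change of the energized adduct to zero.

```latex
% Elementary steps (rate coefficient names k_a, k_b, k_c are assumed labels)
\begin{align*}
\mathrm{A} + \mathrm{B} &\xrightarrow{\;k_a\;} \mathrm{AB}^{*} \\
\mathrm{AB}^{*} &\xrightarrow{\;k_b\;} \mathrm{A} + \mathrm{B} \\
\mathrm{AB}^{*} + \mathrm{M} &\xrightarrow{\;k_c\;} \mathrm{AB} + \mathrm{M}^{*}
\end{align*}

% Rate equations that follow from the three steps
\begin{align*}
\frac{d[\mathrm{AB}^{*}]}{dt} &= k_a[\mathrm{A}][\mathrm{B}] - k_b[\mathrm{AB}^{*}] - k_c[\mathrm{AB}^{*}][\mathrm{M}] \\
\frac{d[\mathrm{AB}]}{dt} &= k_c[\mathrm{AB}^{*}][\mathrm{M}]
\end{align*}

% Steady-state approximation, d[AB*]/dt = 0
\begin{equation*}
\frac{d[\mathrm{AB}]}{dt} = \frac{k_a k_c [\mathrm{A}][\mathrm{B}][\mathrm{M}]}{k_b + k_c[\mathrm{M}]}
\end{equation*}
```

In this form the rate is proportional to [M] when the bath gas is scarce and becomes independent of [M] when it is in large excess, which is the pressure dependence the theory was designed to explain.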
Ultimately, there are more ramifications departing from the Lindemann-Hinshelwood theory, for example the conformation of the energized state and the nature of the third body. These aspects would be too complicated to tackle in an undergraduate class and would rather be material for further study in an advanced course. The research question arising from this reaction scheme is: "Is it possible to build a computer based simulation that accurately depicts the simple Lindemann-Hinshelwood theory and that can be used to effectively teach students chemical kinetics?" Reconciling the historical perspective provided by Justi [13,14] with the practical reality of the classroom, one can affirm that simulations do, in fact, model reality with different degrees of accuracy [15]. With specific reference to the theories expressed in the book by Maggie Renken [15], computer based simulations would fit in between those models that dynamically mimic reality and the ones purely based on an algorithm. The algorithm in this case is not a complicated one; rather, it is the system of differential equations relating the rate of reaction to the elementary processes (1), (2), and (3).
Since this is a quantitative investigation, a hypothesis (H1) must be defined to properly evaluate the effectiveness of the teaching. The hypothesis can be written simply by restating the research question as a positive statement. Hypothesis (H1): It is possible to build a computer based simulation that accurately depicts the simple Lindemann-Hinshelwood theory and that can be used to effectively teach students chemical kinetics. In other words, when students are allowed to simulate a chemical reaction under a guided learning approach, the test scores of their evaluations should be equal to or better than the test scores earned by learning with the traditional lecture and book method.
The underlying assumption is that the theoretical aspects of the reaction are described by a model built on the Lindemann-Hinshelwood theory. The aim of this experiment is that, by simulating a simple chemical reaction, the students should be more likely to understand the different aspects of the elementary steps involved in the reaction and, as a consequence, should perform better, or earn higher scores, when they are tested on this specific topic.
In contrast to the above, the null hypothesis (H0) predicts that there is no significant difference between the two instructional approaches. In other words, the null hypothesis would be true if, after performing computer based simulations of a simple chemical reaction for a period of time deemed sufficient to learn the topic under normal teaching and learning conditions, the students did not achieve adequate scores on a quantitative evaluation. While the topic is not one of the easiest, and is often reserved for advanced graduate classes [8], there are several reasons to speculate that this approach may actually lead to a positive outcome. Several authors have reported that the generation born between 1981 and 1999 (millennials) tends to favour a computer based and self-directed approach to teaching and learning over other educational techniques [16,17]. In addition, electronic devices are now more widely available than ever, with computers present in almost any college library in the world [18].
Another positive element to consider in support of this study is the general orientation of the science educator community toward PBL [1,15,19], where students' abilities are literally used as a scaffold to support their educational exploration.

MATERIAL AND METHODS
To answer the research question, a sample of 104 students attending second year chemistry classes at a major Southern US college between 2015 and 2017 was asked to participate in a Solomon four-group study focused on their understanding of chemical kinetics. The study was submitted to and approved by the Institutional Review Board (IRB) of the college before implementation. Participation in the study was strictly voluntary, and participants had to sign an informed consent form, approved by the college IRB, explaining the importance of their participation in the study and the potential risks associated with it. Out of the original 104 students selected, 66 actively participated by engaging in meetings, returning assignments on time, and ultimately taking both pre-test and post-tests, giving a participation rate of about 64%. The Solomon four-group design is discussed elsewhere [20,21]; it represents a way of ruling out any possible effect of the pre-test on the final results.
The 104-student sample is a convenience sample in the sense that it was selected based on the availability of second and third year classes covering chemical kinetics and the willingness of the respective faculty to participate in the study. In that sense no randomization occurred at the level of the sample, while the assignment of the students within the sample to any specific group was randomized. The sample of students was subdivided into four nonequivalent groups. The first group, an experiment group consisting of 29 students, underwent a pre-test (O1), was then exposed to the guided learning approach (X), and was finally tested again on a post-test (O2). The pre-test was a 20-question multiple choice question (MCQ) test on chemical kinetics designed after the college's course competencies and learning objectives, and was intended to ascertain any prior knowledge of the students on the subject matter.
The guided teaching extended over a two-week period and consisted of a one-hour introductory lecture, a detailed instruction sheet on how to perform chemical simulations with a simulator of choice, a one-week study period during which the instructor was available remotely for consultation, a one-hour face-to-face troubleshooting meeting, a second week of independent study, and a final one-hour face-to-face summary meeting to debrief. The troubleshooting meeting was used to verify the progress of the students in the use of the simulator. From the contacts made with the instructor during the two-week period, it was also possible to infer whether or not the students were actually learning to use the simulator and thus generating questions and concerns. One of the most common concerns of the students performing the simulation was voiced as: "How do we know the outcome of our simulations is correct?" The answer to this question often required consultation of reference material, either the textbook or open sources of information. The importance of having reference material was therefore underscored by the present study, which does not deny the relevance of current teaching practices but rather focuses on providing a more engaging experience for students. The post-test, consisting of a battery of 35 MCQs of which only 10 were repeated from the pre-test, was designed to take place right after the debriefing.
The simulator suggested for the study was TENUA [22], a Java based simulator that is widely accepted in the kinetics community and relatively easy to use. An important factor in choosing TENUA as the simulator for this study is that, while retaining the rigor needed for an academic study, it lowers the skill barrier needed to run it efficiently. The chemical system chosen for the simulation is the well-known three-body recombination reaction of nitrogen monoxide with the hydroxyl radical [23,24]. This reaction may lead to at least four different outcomes, although energetic studies show that the formation of nitrous acid, in both its cis and trans isomeric configurations, is the reaction path featuring the lowest energy barrier and is therefore the main channel for this reaction [24]. In addition, experimental studies show that the three-body recombination reaction is a highly predictable system, featuring an absolute rate coefficient of $k_{II} = 1.4 \times 10^{-12}$ cm$^{3}$ molecule$^{-1}$ s$^{-1}$ under a pressure of 50 Torr of helium [23].
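To give a feel for what such a simulation computes, the sketch below integrates the decay of OH in the presence of NO using the rate coefficient quoted above as an effective second-order coefficient. It is plain Python and not TENUA input or TENUA code; the function names, the Runge-Kutta integrator, the initial concentrations and the reaction time are all hypothetical choices made for illustration.

```python
import math

# Effective second-order rate coefficient at 50 Torr He, as quoted in the text
# (units: cm^3 molecule^-1 s^-1).
K_II = 1.4e-12


def rates(oh, no):
    """Time derivatives of [OH], [NO], [HONO] for OH + NO (+ M) -> HONO (+ M),
    treated here as a single effective bimolecular step (an assumption)."""
    r = K_II * oh * no
    return (-r, -r, r)


def simulate(oh0, no0, t_end, steps=1000):
    """Integrate the rate equations with the classical fourth-order
    Runge-Kutta method and return ([OH], [NO], [HONO]) at t_end."""
    h = t_end / steps
    oh, no, hono = oh0, no0, 0.0
    for _ in range(steps):
        k1 = rates(oh, no)
        k2 = rates(oh + 0.5 * h * k1[0], no + 0.5 * h * k1[1])
        k3 = rates(oh + 0.5 * h * k2[0], no + 0.5 * h * k2[1])
        k4 = rates(oh + h * k3[0], no + h * k3[1])
        oh += h / 6.0 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        no += h / 6.0 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        hono += h / 6.0 * (k1[2] + 2 * k2[2] + 2 * k3[2] + k4[2])
    return oh, no, hono


if __name__ == "__main__":
    # Hypothetical initial concentrations (molecule cm^-3) with NO in large excess.
    OH0, NO0, T_END = 1.0e11, 1.0e14, 0.02
    oh, no, hono = simulate(OH0, NO0, T_END)
    # With NO in excess the decay is close to pseudo-first-order.
    approx = OH0 * math.exp(-K_II * NO0 * T_END)
    print(f"[OH] numerical: {oh:.4e}  pseudo-first-order estimate: {approx:.4e}")
```

Comparing the numerical result with the pseudo-first-order estimate is one simple way for a student to address the question of whether the outcome of a simulation is plausible.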
A second nonequivalent group of 10 students was exposed to the same pre-test as the first nonequivalent group (O3), but then, instead of being exposed to the material developed to guide them through the computer simulation of a chemical system, they were exposed to the same classroom material that was supposed to be taught in the lecture class [25], with some additional reading of specialized literature [26]. This group is referred to as the first control group (C). After two weeks of study, the progress of the control group was tested with the same battery of 35 multiple choice questions used for the experiment group (O4). Both the first experiment group and the first control group thus underwent a pre-test and a post-test.
A third nonequivalent group of 20 students was also exposed to the guided learning approach (X) and to the 35 MCQ post-test (O5), but was not exposed to the pre-test. Finally, a fourth nonequivalent group of 9 students, representing the control (C), was exposed to the same references as the first control group [25,26] and was allowed the same time to study and become familiar with the subject matter before undergoing a post-test (O6). This fourth nonequivalent group was not pre-tested.
Braver and Braver [21] describe in detail how to analyze the results of a Solomon four-group study. The first step is to ascertain whether there is any measurable difference between the groups that were exposed to the pre-test and the groups that were not. If there is a statistically significant difference, it is assumed that there is a pre-test effect and the groups are analyzed separately. If, on the other hand, there is no statistically significant difference, it can be concluded that the pre-test has no effect, and the experiment groups and the control groups, respectively, can be treated as homogeneous groups. This analysis is done by performing an analysis of variance (ANOVA) [27] on all four groups at a significance level of 5%.
After the potential effect of the pre-test is examined, there are three possible outcomes for this research design. First, if the experiment groups perform significantly better (P ≤ .05) than the control groups, then the null hypothesis (H0) is rejected and the hypothesis (H1) must be accepted. Second, if the control groups and the experiment groups perform equally, the null hypothesis (H0) could be held true; in essence, there would be no difference between the guided learning approach of the present investigation and the current standard practice of reading and learning from textbooks. Third, if the control groups perform significantly better (P ≤ .05) than the experiment groups, then the traditional teaching and learning approach stands its ground, and the methodology of the present investigation does not improve on it.

RESULTS AND DISCUSSION
ANOVA was used to investigate the differences between the distribution of the pre-test results (O1+O3) and the distributions of the post-test results for the control distribution (O4+O6) and for the experimental distribution (O2+O5). ANOVA is a statistical tool designed to detect statistically meaningful differences between groups of data. The analysis of variance is based on three assumptions: (1) the populations have the same or similar variance, (2) the populations are normally distributed, and (3) each data point is independent of the others. The three datasets compared in this study met these three criteria; therefore, the application of the analysis of variance was justified. Since the post-test comprised 35 MCQs and the pre-test only 20, the data for all the distributions were normalized to a percentage scale in order to perform the ANOVA. The null hypothesis (H0a) states that there is no difference between the sets of data. The alternate hypothesis (H1a) states that there is a statistically significant difference between the three sets of data.
When performing the ANOVA, the variation between and within groups (SS) is calculated as a sum of squared deviations. The degrees of freedom (df) are then obtained, respectively, as the number of groups minus 1 and as the total number of data points minus the number of groups (that is, the number of data points minus 1 for each group). In this particular application three groups of students are compared; therefore, the degrees of freedom between the groups is 3-1=2. On the other hand, since a total of 115 independent data points arranged in three separate groups are considered, the degrees of freedom within the groups is 115-3=112. The mean squares (MS) are calculated by dividing the sums of squares (SS) between and within the groups by the respective degrees of freedom (df). Finally, the F ratio is obtained by dividing the mean square between the groups by the mean square within the groups. F values were tabulated by Sir Ronald Fisher, a statistician from the United Kingdom [28], so that the output of this sequence of calculations can be compared against known results, which speeds up the process of inferring the outcome of the hypothesis testing. Running the ANOVA with the Excel Analysis ToolPak yields the output shown in Table 2. In this preliminary analysis, the subscript a attached to the hypothesis symbol H signifies that the hypothesis refers only to the preliminary ANOVA.
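The sequence of calculations described above (SS, df, MS, F) can be sketched in a few lines of Python. This is an illustration of a generic one-way ANOVA, not the spreadsheet procedure used in the study; the function name and the small data set in the usage example are hypothetical and are not the study's scores.

```python
def one_way_anova(groups):
    """One-way ANOVA on a list of groups (each a list of numbers).

    Returns a dict with the sums of squares (SS), degrees of freedom (df),
    mean squares (MS) and the F ratio.
    """
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]

    # SS between: squared distance of each group mean from the grand mean,
    # weighted by the group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # SS within: squared distance of each data point from its own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    df_between = len(groups) - 1                 # number of groups - 1
    df_within = len(all_values) - len(groups)    # data points - number of groups

    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ss_between": ss_between, "ss_within": ss_within,
        "df_between": df_between, "df_within": df_within,
        "ms_between": ms_between, "ms_within": ms_within,
        "f_ratio": ms_between / ms_within,
    }


if __name__ == "__main__":
    # Hypothetical percentage scores for three small groups.
    example = [[40.0, 45.0, 50.0, 55.0], [65.0, 70.0, 75.0], [60.0, 70.0, 80.0, 70.0]]
    result = one_way_anova(example)
    print(result["df_between"], result["df_within"], round(result["f_ratio"], 2))
```

The computed F ratio is then compared with the tabulated critical value for the same pair of degrees of freedom; the null hypothesis is rejected when the computed value is larger.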
The ANOVA output shows that the critical value of F (3.077) is smaller than the calculated F (38.54); therefore, the null hypothesis (H0a) that there is no difference between the three distributions is rejected and the alternate hypothesis (H1a) is accepted. To ascertain where the difference between the three groups lies, the protocol established by Braver and Braver [21] suggests performing individual Student's t-tests between the groups. By performing three supplemental t-tests between each pair of distributions, it becomes evident that the significant differences are between the distribution of the pre-test results (O1+O3) and the distributions of the post-test results for the control distribution (O4+O6) and for the experimental distribution (O2+O5). Regarding the possible effect of the pre-test, the experimental nonequivalent group of students subject to the pre-test includes n1 = 29 individuals; the mean of their post-test scores is 21.86, with a standard deviation of 4.25.
On the other hand, the experimental group of students that was not subject to the pre-test includes n3 = 20 individuals; the mean of their post-test scores is 22.30, with a standard deviation of 4.15.
Intuitively, one would expect the opposite: under the assumptions of the Solomon four-group design, the mean of the post-test scores of the group that was not subject to the pre-test would be expected to be lower than that of the group that was subject to the pre-test. In this specific case the opposite is found and, therefore, by looking at the data one would expect the difference to be zero or minimal. This can be explained, for example, by considering that the original reference by Solomon [20], while promoting awareness of the possible effects of the pre-test in the field of social research, was directed to a substantially different context. The t value calculated for this first comparison was t=0.36, substantially below the critical value of t=2.007 obtained from statistical tables; this confirmed that there was no statistically significant difference between the experimental nonequivalent group subject to the pre-test and the experimental nonequivalent group not subject to the pre-test.
The comparison between the post-test scores of the nonequivalent group exposed to the pre-test and those of the nonequivalent group not exposed to the pre-test was also extended to the control group. The control group that was subject to the pre-test includes n2 = 10 students; the mean of their post-test scores is 21.2, with a standard deviation of 2.52. The control group that was not subject to the pre-test includes n4 = 9 students; the mean of their post-test scores is 24.22, with a standard deviation of 3.83. Also in this case, the mean of the post-test scores of the participants who were not exposed to the pre-test is slightly higher than that of the participants who were exposed to it. Therefore, also in this case one would expect the difference to be zero or minimal. Since for this comparison t=2.00 is below the critical value of t=2.110 obtained from statistical tables (two tails, p=0.05), it is safe to say that there is no statistically significant difference between the post-test scores of the control group that was exposed to the pre-test and those of the control group that was not. Since the Student's t-test analysis for both the control and the experimental groups showed no statistically significant difference between the pre-tested and non-pre-tested groups, it is appropriate to analyze the data as two homogeneous distributions: an experiment distribution and a control distribution.
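The two pre-test comparisons above can be checked from the summary statistics alone. The sketch below assumes the unequal-variance (Welch) form of the t statistic, which is an assumption made here because the text does not say which form was used; with the quoted means, standard deviations and group sizes it gives values close to the reported t=0.36 and t=2.00. The function name is hypothetical, and the critical values are the ones quoted in the text.

```python
import math


def t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """t statistic for two independent groups from summary statistics,
    using the unequal-variance (Welch) standard error (an assumption)."""
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean2 - mean1) / standard_error


# Experimental groups: pre-tested (n=29) vs not pre-tested (n=20).
t_experiment = t_from_summary(21.86, 4.25, 29, 22.30, 4.15, 20)
# Control groups: pre-tested (n=10) vs not pre-tested (n=9).
t_control = t_from_summary(21.20, 2.52, 10, 24.22, 3.83, 9)

# Critical values as quoted in the text (two tails, p = 0.05).
T_CRIT_EXPERIMENT = 2.007
T_CRIT_CONTROL = 2.110

if __name__ == "__main__":
    for label, t, crit in (("experiment", t_experiment, T_CRIT_EXPERIMENT),
                           ("control", t_control, T_CRIT_CONTROL)):
        verdict = "no significant difference" if abs(t) < crit else "significant difference"
        print(f"{label}: t = {t:.2f}, critical = {crit}, {verdict}")
```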
Since it was not possible to ascertain any statistically significant difference between the nonequivalent groups, both experiment and control, exposed to the pre-test and those not exposed to it, the post-test scores of the control and the experimental groups were aggregated into two separate distributions. The first, referred to as the experimental distribution, includes the post-test data collected from both experimental groups of students, whether exposed to the pre-test (O2) or not (O5). The second, or control distribution, was built from the data of the control groups both with and without the pre-test (O4+O6). The total number of data points analyzed was therefore n=49 for the experimental distribution (O2+O5) and n=20 for the control distribution (O4+O6). The term distribution is used for the groups of data (O4+O6) and (O2+O5) to underline the fact that these were not physical groups of students performing an experiment, but rather groups of numbers generated by aggregating the experimental groups.
The experimental distribution generated an average post-test score of 22.04, with a standard deviation of 4.17. The control distribution generated an average post-test score of 22.63, with a standard deviation of 3.48. A t-test performed to compare the two distributions gave t=0.6, well below the critical tabulated value of t=1.994 for 67 degrees of freedom at p=0.05. No statistically significant difference could be found between the two distributions. This is probably the most important finding of this study: the means of the post-test scores of the experiment distribution and of the control distribution were statistically indistinguishable.
There are several possible explanations for this. A first possible explanation is that both groups studied from the same material, for example by forming study groups during the week without reporting that to the faculty administering the experiment. Students were warned not to mix between the groups, but faculty had no means of verifying that other than personally interviewing students. When asked, students consistently reported having studied independently. Another possibility is the use of the same, or similar, study material. While the instructions for the computer simulations were in fact only distributed to the experiment group, both groups had access to the class textbook [25]. In addition, both the experiment and the control group had access to the internet as a potential source of information. It cannot be ruled out that both groups accessed similar material while studying, although the interaction with the instructor(s) during the two weeks allowed for the project suggests that the experimental group did, in good faith, perform their simulations as a learning tool. A third possible explanation is contamination at the time of the test, although this would be unlikely since the post-test was administered in a controlled environment and supervised by a faculty member.
The difference between the means of the two distributions, 0.59, is well below the standard deviations associated with each of them, which were in the range of four. The comparison between the normalized data is illustrated in Figure 1. From the comparison, it is possible to observe that both the control and the experiment nonequivalent groups show a major improvement after the treatment. It is also possible to infer that the distributions of the post-test scores of the experiment and control nonequivalent groups mostly occupy the same region of the chart, thus having very similar indicators of center.
Since no statistically significant difference was observed between the control and the experimental distributions, the inquiry then shifted to whether there was any improvement in the test scores, by comparing the values of the pre-tests for the experimental (O1) and the control (O3) distributions with the values of the post-tests for the experimental (O2+O5) and the control (O4+O6) distributions. The distribution of the pre-test scores of the experimental group (O1, n=38) returned an average value of 9.63 with a standard deviation of 4.04, while the distribution of the pre-test scores of the control group (O3, n=9) returned an average of 8.00 with a standard deviation of 3.93. The two means fall within one standard deviation of each other; therefore, no statistically significant difference between the two could be identified. This sets a common baseline: the average of all the pre-test scores (O1+O3, n=47) was calculated to be 9.32 ± 4.03. This average was substantially different from the averages of the post-test for the experimental (O2+O5) and control (O4+O6) distributions, showing that substantial improvements were actually made by students belonging to both groups. Calculating the relative improvements of the control and the experiment distributions separately, however, it is possible to point out a 128.8% improvement in the test scores of the experiment distribution versus a 182.8% relative improvement of the control distribution. This could be due to different factors, including student selection, availability of certain study material to both groups, possible contamination, and others. While the student selection was in fact randomized for each cluster, not all the students participated; it is therefore possible that only certain students decided to move on with the study and that those students who did participate set, for example, a lower baseline for the control distribution. Another possibility is that the students who participated in the control distribution felt a greater need to improve than those of the experimental distribution, thus achieving a better overall improvement.
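The baseline and the relative improvements quoted above follow from simple arithmetic on the reported means. The sketch below reproduces them using the mean scores as quoted in the text; the function names are hypothetical, and the small differences in the last decimal are presumably due to rounding of the reported means.

```python
def weighted_mean(means_and_sizes):
    """Combine group means into one overall mean, weighting by group size."""
    total = sum(n for _, n in means_and_sizes)
    return sum(mean * n for mean, n in means_and_sizes) / total


def relative_improvement(pre_mean, post_mean):
    """Improvement of the post-test mean over the pre-test mean, in percent."""
    return 100.0 * (post_mean - pre_mean) / pre_mean


# Pre-test means and group sizes as reported: O1 (n=38) and O3 (n=9).
baseline = weighted_mean([(9.63, 38), (8.00, 9)])

# Pre-test and post-test means as reported for each distribution.
gain_experiment = relative_improvement(9.63, 22.04)
gain_control = relative_improvement(8.00, 22.63)

if __name__ == "__main__":
    print(f"common baseline: {baseline:.2f}")
    print(f"experiment: {gain_experiment:.1f}%  control: {gain_control:.1f}%")
```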
Paper on Education
Orbital: Electron. J. Chem. 9 (4): 299-307, 2017

A separate possibility to take into consideration is that the material for the guided learning approach was not clear. To rule out this possibility, a statistical analysis of the post-test MCQs was performed, looking at the mean success rate for each question and at how the success rates of the individual MCQs differ from each other. On average, 42 of the 66 participating students answered each MCQ correctly, with a standard deviation of 12.18. A detailed analysis of the MCQs is reported in Table 3. Using the criterion of µ ± 3σ to judge whether an MCQ was within or outside the interval of normality, only Q(10) fails the test, at its lower border of µ-3σ. Therefore, it is unlikely that the students exposed to the guided learning approach were substantially confused, and led outside the range of normality, by the study material developed for this study.
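The µ ± 3σ criterion used above amounts to a simple interval check. The sketch below computes the interval from the mean and standard deviation reported in the text; the function names are hypothetical, and no per-question counts are included because they are reported only in Table 3.

```python
def normality_interval(mean, sd, k=3.0):
    """Return the (lower, upper) bounds of the mean +/- k standard deviations interval."""
    return mean - k * sd, mean + k * sd


def is_within(count, mean, sd, k=3.0):
    """True if the number of correct answers for a question lies inside the interval."""
    lower, upper = normality_interval(mean, sd, k)
    return lower <= count <= upper


# Mean and standard deviation of correct answers per MCQ, as reported in the text.
MEAN_CORRECT = 42.0
SD_CORRECT = 12.18

lower_bound, upper_bound = normality_interval(MEAN_CORRECT, SD_CORRECT)

if __name__ == "__main__":
    print(f"interval of normality: {lower_bound:.2f} to {upper_bound:.2f}")
```

Because the upper bound exceeds the number of participating students, a question can only fail this criterion on the lower side, which is consistent with the single failure being reported at µ-3σ.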

CONCLUSION
The present study supports the idea that PBL can be used in a college classroom to enhance students' learning of chemical kinetics. By performing computer based simulations of a simple reaction system, students can master basic chemical concepts. The scores on an MCQ post-test earned by students studying with a PBL module were, in fact, as good as those of the students studying from a traditional textbook. In addition, students became more aware of their computer skills and had an opportunity to realize the time and effort they need to invest to produce measurable progress in a simulated research environment. This experience also underscored the importance of having reliable reference material. PBL in fact literally scaffolds on students' prior knowledge, but it must also be supported by the mentorship and supervision of an instructor. Overall, the long term effect of PBL requires more investigation, and several questions remain open, for example the percentage of classroom time to dedicate to PBL instead of, or in addition to, traditional teaching and learning techniques.

Figure 1. Comparison between the normalized pre-test scores of the experimental (O1) and control (O3) groups, characterized respectively by an average of 47.93 ± 23.77 and 38.50 ± 17.17, and the normalized post-test scores of the experimental (O2) and control (O4) groups, characterized respectively by an average of 71.72 ± 18.83 and 70.67 ± 8.43.

Table 1. Solomon four-group research design as it applies to the present study.

Table 2. Output of the analysis of variance (ANOVA) performed with Microsoft Office 365 ProPlus.

Table 3. Analysis of the scores of the post-test.