A Randomized Study to Evaluate the Effect of a Nudge via Weekly E-mails on Students’ Attitudes Toward Statistics

Abstract
Can a “nudge” toward engaging, fun, and useful material improve student attitudes toward statistics? We report on the results of a randomized study to assess the effect of a “nudge” delivered via a weekly E-mail digest on the attitudes of students enrolled in a large introductory statistics course taught in both flipped and fully online formats. Students were randomized to receive either a personalized weekly E-mail digest with course information and a “nudge” to read and explore interesting applications of statistics relevant to the weekly course material, or a generic course E-mail digest with the same course information, and no “nudge.” Our study found no evidence that “nudging” students to read and explore interesting applications of statistics resulted in better attitudes toward statistics. Supplementary materials for this article are available online.


Introduction
Increased use of technology for elements of course delivery raises many questions about its effect on student achievement and engagement. In considering these questions, statistics instructors have experimented with a variety of course delivery methods including blended, flipped, and fully online courses. Blended courses use a combination of face-to-face and online delivery; courses are often categorized as blended when the proportion of course material that is delivered online is between 40% and 80% (Boettcher and Conrad 2016, p. 11). In flipped courses, course content is delivered through asynchronous online components and face-to-face meetings are used for active learning, such as discussion and problem-solving in groups (Boettcher and Conrad 2016, p. 12). For introductory statistics courses, a number of studies have investigated whether these different delivery methods result in different learning outcomes (e.g., Utts et al. 2003; Wilson 2013; Gundlach et al. 2015; Touchton 2015; Peterson 2016; Nielsen, Bean, and Larsen 2018), whether they can lead to improved retention (Winquist and Carlson 2014), and whether different methods have an effect on student attitudes toward statistics (e.g., Gundlach et al. 2015; Loux, Varner, and VanNatta 2016).
A positive relationship between attitude and achievement has been demonstrated in many studies (Emmioglu and Capa-Aydin 2012). In addition to possible differences among delivery methods, students' engagement in introductory statistics has been shown to be related to the perceived usefulness of statistics (Hassad 2018). As a consequence, many introductory courses have focused on statistical literacy. Statistical literacy can be defined as "[p]eople's ability to interpret and critically evaluate statistical information and data-based arguments appearing in diverse media channels, and their ability to discuss their opinions regarding such statistical information" (Gal (2000) as cited in Rumsey 2002 and Hassad 2018). Statistics instructors often present interesting media stories that contain statistical information as a strategy to help students engage in statistical concepts, demonstrate the usefulness of statistics, and promote statistical literacy.
Some instructors have inserted elements of fun in their statistics courses in the hopes of reducing student anxiety, improving attitudes toward statistics, and improving student achievement (Lesser, Pearl, and Weber III 2016). Types of fun include cartoons, games, and quotations (Lesser and Pearl 2008).

Choice Architecture and Nudging
"A choice architect has the responsibility for organizing the context in which people make decisions. " "A nudge ... is any aspect of the choice architecture that alters people's behavior in a predictable way without forbidding any options or significantly changing their economic incentives. To count as a mere nudge, the intervention must be easy and cheap to avoid. Nudges are not mandates. Putting the fruit at eye level counts as a nudge. Banning junk food does not. " (Thaler and Sunstein 2008) Thaler and Sunstein (2008) developed the theory of nudging based on work by the psychologists Daniel Kahneman and Amos Tversky that began in the 1970s (e.g., Tversky and Kahne-man 1971;Kahneman and Tversky 1979) and eventually developed into the field of behavioral economics. Kahneman and Tversky discovered that individuals "have erroneous intuitions about the laws of chance" (Tversky and Kahneman 1971), and assess loss and gain asymmetrically (Tversky and Kahneman 1991), which is inconsistent with a key assumption in classical economic theory which posits that consumers behave in a rational manner (Tversky and Kahneman 1991;Kahneman 1994).
Amelioration of decision-making in cases where people make "... decisions that are difficult and rare ... do not get prompt feedback, and when they have trouble translating aspects of the situation into terms that they can easily understand" is central to nudging (Thaler and Sunstein 2008). The low cost of nudging people in a direction is a major appeal, and part of the reason that more than 200 government teams around the world are developing, testing, and scaling behavioral interventions, although education was one of the last areas to receive attention from behavioral scientists (Oreopoulos 2020).

Nudging in Education
In a review of nudging interventions in education, Damgaard and Nielsen (2018) conclude that the "greatest effects often arise for individuals affected most by the behavioral barrier targeted by the intervention." Page, Lee, and Gehlbach (2020) describe the results of a randomized study in which a persistence-focused chatbot that provided outreach and support to undergraduates at Georgia State University was effective. A study by van Oldenbeek et al. (2019) used an alternating treatment design to evaluate the effect of E-mail-based progress feedback in a blended course, and found that nudges were effective in increasing the number of minutes viewing class videos. In the context of learning analytics, there have been recent studies showing positive benefits of nudging through targeting E-mails to students whose engagement in course resources through the learning management system is below average (Fritz 2017; Lawrence et al. 2021; Brown et al. 2022). Oreopoulos (2020) provides an overview of nudging in education.
Instructors presenting engaging material such as media stories and elements of fun must consider how they make these materials available to their students. When instructors design course learning objectives, select topics to achieve these objectives, and create assessments, they create choices for students to become engaged in course material. A choice architect is someone who can indirectly influence the choices other people make (Thaler and Sunstein 2008). A course instructor is an example of a choice architect. Choices presented to students in a course can indirectly influence the choices that they make to engage with the material (e.g., students can choose to do some external reading related to course topics). Most people will choose the default option, that is, the option that is obtained if the chooser does nothing. This is the option that requires the least effort. Can instructors influence this behavior? More specifically, can students in an introductory statistics course be "nudged" toward having better attitudes toward statistics by changing the default options students have to engage with the material?

Randomized Studies in Education
Carrying out a randomized controlled trial is the gold standard in assessing the influence of an intervention, and requires, in the case of a single treatment, randomly assigning some study participants to a treatment group and the remainder of the study participants to a control group. Most courses operate on the principle that all students in a course will have the exact same course resources available in the natural learning situation of teaching in groups, such as lectures or seminars (Dreyhaupt et al. 2017). Thus, in educational settings it is often the case that it is neither practical nor ethical to randomly assign students to treatment or control groups, even though randomized studies "deliver more convincing results" (Dreyhaupt et al. 2017). A notable exception is the randomized study carried out by Lesser, Pearl, and Weber III (2016), which showed that students interacting with online content including an element of fun (a song) responded correctly more often to a related assessment question, but did not show improvements in attitudes.
In this article, we describe the impact of a randomized intervention that was implemented in a large multisection introductory statistics course taught in both flipped and fully online formats designed to nudge students toward developing better attitudes toward statistics. The study's motivation was to address some negative results we had observed in changes in students' attitudes, particularly for students in fully online sections of the course when compared to students taught in the flipped format (Gibbs and Taback 2018). Through a randomized design, we investigated whether the use of a "nudge" could lead to measurably improved attitudes if the "nudge" engaged students in "interesting" and "entertaining" material related to, but beyond, the scope of the course.

Background
The Practice of Statistics I (STA220H1F) is a large (1300+ students in 2015 and 1600+ students in 2016, in sections of 200-400 students) multisection introductory statistics course at the University of Toronto. At the time, it could be used as a first course for students who go on to complete a program of study in statistics; however, approximately 95% of the students in the course were not enrolled in a statistics program of study and were using the course as the sole statistics course for another program of study, such as life sciences. In the fall of 2014, two of five sections of the course were taught in a flipped style as a pilot. In fall 2015 and 2016, all sections were taught in a flipped style and two fully online sections were added to the course. All sections used the same online materials with a common final exam. The common online materials included video lectures, R Shiny apps with accompanying learning activities, not-for-credit multiple choice concept checks, and for-credit quizzes. In the flipped sections, in-person class time was used for mini lectures to address misconceptions, group problem-solving, and peer instruction to build conceptual understanding (Crouch and Mazur 2001). In parallel to the in-person activities of the flipped sections, the fully online sections supplemented the online material with synchronous and asynchronous discussions about common misconceptions, asynchronous group problem-solving, and synchronous tutorials. The synchronous tutorials consisted of just-in-time teaching, learning activities, and small group work. Discussions in online tutorials were initiated by both students and instructors.
In fall of 2015 we collected data on students' attitudes toward statistics. Most dimensions of students' attitudes became worse by the end of the course, and were statistically significantly lower in the online sections of the course (Gibbs and Taback 2018). While observing such a decline is not unusual (Schau and Emmioglu 2012), we sought strategies to better support the development of positive attitudes, particularly for students in the online sections. In fall 2016, weekly E-mails to students about the course were tested to see if their use could positively influence student attitudes.

The Intervention and Study Hypothesis
Given the evidence that perceived usefulness of statistics leads to more positive attitudes and that positive attitudes lead to better achievement, we considered how as instructors we might take advantage of our role as choice architects to improve student perceptions. In particular, we asked: Can students in an introductory statistics course be "nudged" toward having better attitudes toward statistics by changing the default choice students have to engage with the material? We thus investigated whether students who receive a nudge to read and explore additional entertaining and engaging information related to course topics develop better attitudes toward statistics. If students form an appreciation of the relevance of statistics in everyday life, would this lead to better attitudes toward statistics? In this study, we evaluated the hypothesis that students who develop an appreciation of the relevance of statistics in everyday life will have better attitudes toward statistics.

Weekly E-mail Digest
The primary aim of this study was to test if nudging students to read and explore interesting and fun real-life stories involving statistics would result in better attitudes toward statistics. It is not feasible, however, to force students to read these stories or to randomly assign, say, only half the class to read them. Instead, we could present half of the students with a nudge toward these stories in an E-mail.
The nudge consisted of web links to news items, blog posts, or videos that connected weekly topics to real world applications, relating statistical concepts covered in the course to real data with context and purpose, even though these examples were sometimes not covered in the course. "Using real datasets of interest to students is a good way to engage students in thinking about the data and relevant statistical concepts ... reflections of students who used real data in a statistics course ... found the use of real data was associated with students' appreciating the relevance of the course material to everyday life. Further, students indicated that they felt the use of real data made the course more interesting" (GAISE College Report ASA Revision Committee 2016).
In fall 2016, STA220H1F students were randomized to receive a personalized weekly E-mail digest with a nudge and routine course information or a generic weekly E-mail digest without a nudge and the same routine course information.
The weekly E-mail digest with a nudge was considered the active treatment, and the E-mail digest without a nudge the control treatment. Routine course information included timely course information, such as weekly topics, weekly homework assignments, and upcoming milestones.
Analyses of 2015 STA220H1F data (Gibbs and Taback 2018) showed that attitudes and final course grades differed by cumulative grade point average (CGPA). As a consequence, the 2016 randomization was stratified by performing a randomization separately within each stratum of CGPA and lecture section to ensure that the study was balanced by CGPA within each lecture section.
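The stratified assignment described above can be illustrated with a short sketch. This is not the authors' procedure or code: the roster, the section labels, the letter-grade CGPA bands, the seed, and the function name `stratified_randomize` are all hypothetical, and the source does not say how CGPA strata were defined or how odd-sized strata were handled. The sketch only shows the general idea of randomizing separately within each combination of lecture section and CGPA stratum so that the two arms stay balanced within every stratum.

```python
import random
from collections import defaultdict


def stratified_randomize(students, seed=2016):
    """Assign each student to 'nudge' or 'no nudge' within (section, CGPA band) strata.

    Hypothetical helper: the strata definition and the seed are assumptions.
    """
    rng = random.Random(seed)

    # Group student ids by stratum.
    strata = defaultdict(list)
    for student in students:
        strata[(student["section"], student["cgpa_band"])].append(student["id"])

    assignment = {}
    for key in sorted(strata):
        ids = list(strata[key])
        rng.shuffle(ids)
        n_nudge = len(ids) // 2
        # In an odd-sized stratum, the extra student goes to a randomly chosen arm.
        if len(ids) % 2 == 1 and rng.random() < 0.5:
            n_nudge += 1
        for position, student_id in enumerate(ids):
            assignment[student_id] = "nudge" if position < n_nudge else "no nudge"
    return assignment


# Hypothetical roster: 3 sections x 4 CGPA bands x 7 students per stratum.
students = []
for section in ("L0101", "L0201", "ONLINE"):
    for band in ("A", "B", "C", "D"):
        for _ in range(7):
            students.append({"id": len(students), "section": section, "cgpa_band": band})

assignment = stratified_randomize(students)
```

Because the shuffle happens inside each stratum, the arm sizes within any section-by-CGPA cell can differ by at most one student, which is the balance property the stratification is meant to provide.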
Data on students' CGPA, sex, and lecture section were obtained from institutional student records.
The study was approved by the University of Toronto research ethics board. Students were asked to review a consent form where they were required to opt in to the study, and would receive an incentive mark of 1% to their final grade for completing pre- and post-SATS-36 (see Consent Form section in Appendices, supplementary materials). Table 1 outlines the components of the weekly digest sent to students, and shows the differences between the two versions. The main feature of the weekly digest with a nudge is the connection between the statistical topics being studied in class and current real world applications that involve these topics, although, as noted in Table 2, they also often included some modalities of "fun" (Lesser and Pearl 2008). The goal was to "nudge" students receiving this version of the weekly digest to explore the connections of what they were learning to the real world, and encourage independent exploration. Table 2 shows the sources and specific examples used to create the sections of the weekly digest with a nudge. These sources were also used to create modern real world examples of statistical concepts and techniques for the class. Figure 1 gives an example of an E-mail digest with a nudge and the corresponding E-mail without a nudge. Additional examples of the weekly E-mail digest are included in the Appendices section, supplementary materials. E-mails were sent using MailChimp in order to facilitate mass E-mails and tracking of whether students opened the E-mails.

SATS-36
Student attitudes were measured using the Survey of Attitudes Toward Statistics SATS-36 (Schau 2003). SATS-36 was chosen among the many surveys designed to assess student attitudes toward statistics because, among these surveys, the strongest evidence of construct validity and internal consistency exists for SATS-36 (Nolan, Beran, and Hecker 2012) and it is very widely used (Schau and Emmioglu 2012). SATS-36 provides measures of student attitudes in six components, defined by Schau (2003) as
1. Affect: students' feelings concerning statistics
2. Cognitive competence: students' attitudes about their intellectual knowledge and skills when applied to statistics
3. Value: students' attitudes about the usefulness, relevance and worth of statistics in personal and professional life
4. Difficulty: students' attitudes about the difficulty of statistics as a subject
5. Interest: students' level of individual interest in statistics
6. Effort: amount of work the student expends to learn statistics
The score for each component is the average of 7-point Likert items, where higher scores correspond to more positive attitudes. SATS-36 was administered at the beginning of the course (SATS-36 pretest) and at the end of the course (SATS-36 posttest). Schau and Emmioglu (2012) indicate that a pre-post change of at least 0.5 is practically meaningful.
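As a minimal sketch of the scoring just described, the following computes a component score as the mean of its 7-point items and flags a pre-post change of at least 0.5 in either direction. The item responses, the number of items, and the function names are hypothetical; responses are assumed to be already coded so that higher values mean more positive attitudes, and a component is treated as missing if any of its items is missing, in line with the complete-item analyses reported later.

```python
PRACTICAL_THRESHOLD = 0.5  # pre-post change cited from Schau and Emmioglu (2012)


def component_score(responses):
    """Mean of 7-point Likert responses, or None if any item is missing."""
    if any(r is None for r in responses):
        return None
    if not all(1 <= r <= 7 for r in responses):
        raise ValueError("Likert responses must be between 1 and 7")
    return sum(responses) / len(responses)


def gain_score(pre_responses, post_responses):
    """Posttest minus pretest component score, or None if either is missing."""
    pre = component_score(pre_responses)
    post = component_score(post_responses)
    if pre is None or post is None:
        return None
    return post - pre


def is_practically_meaningful(gain):
    """True when the absolute change reaches the 0.5 threshold."""
    return gain is not None and abs(gain) >= PRACTICAL_THRESHOLD


# Hypothetical responses for one student on one component.
pre_items = [5, 6, 5, 4]
post_items = [4, 5, 5, 4]
gain = gain_score(pre_items, post_items)
```

For these made-up responses the pretest mean is 5.0 and the posttest mean is 4.5, so the gain is -0.5, a decline that just reaches the practical threshold.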

Results
All statistical analyses were conducted using R version 4.2.

Weekly Interesting E-mail versus Plain E-mails
In 2016, there were 1611 students enrolled in the course. Of these students, 1430 (89%) consented to take part in the study; 703 (49%) students were randomly assigned to the "nudge" group, and 727 (51%) students to the "no nudge" group. Among the 1430 students who consented, 1082 (76%) students completed all the SATS-36 questions. All analyses are based on the 1082 students who completed all SATS-36 questions. Thus, analyses use data from 67% of the students enrolled in the course (see Figure 2). Table 3 gives demographic and academic information on the consenting students who had completed all SATS-36 questions within each of the weekly E-mail digest groups, which are comparable. The random assignment to type of weekly E-mail digest was effective in balancing students by CGPA, sex, year of study, and lecture section. Tables 4 and 5 show the completion rates within each E-mail group by pre- and post-SATS-36 and those that completed both pre- and post-surveys. The individual SATS-36 components show similar distributions across the two E-mail groups and pre- and post-surveys. The pre-SATS-36 survey was completed by almost all students that consented, with completion rates ranging from 95% to 99%, and post-SATS-36 had completion rates ranging from 76% to 81%.
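The participation percentages quoted above follow directly from the reported counts. A small check, using only the numbers given in the text (the variable and function names are illustrative):

```python
enrolled = 1611
consented = 1430
nudge, no_nudge = 703, 727
completed_all = 1082


def pct(part, whole):
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


flow = {
    "consented / enrolled": pct(consented, enrolled),
    "nudge / consented": pct(nudge, consented),
    "no nudge / consented": pct(no_nudge, consented),
    "completed / consented": pct(completed_all, consented),
    "completed / enrolled": pct(completed_all, enrolled),
}

for label, value in flow.items():
    print(f"{label}: {value}%")
```

The two arms also sum to the number of consenting students (703 + 727 = 1430), so every consenting student was assigned to exactly one group.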
Missing data patterns were computed using the md.pattern() function from the mice package in R (van Buuren and Groothuis-Oudshoorn 2011). There are 11 and 9 missing data patterns in the pre- and post-surveys, respectively. The most frequent missing data pattern in the pre- and post-surveys is missing items required to compute all the SATS-36 components (pre-SATS-36: n = 38 (3%), post-SATS-36: n = 302 (21%)).
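To make the idea of a missing data pattern concrete, the sketch below tabulates patterns in the spirit of md.pattern(): each respondent is reduced to a tuple with 1 where a component is observed and 0 where it is missing, and identical tuples are counted. It is a plain Python illustration rather than the R function used in the study, and the three component names and the data are hypothetical.

```python
from collections import Counter


def missing_patterns(rows, columns):
    """Count distinct missingness patterns (1 = observed, 0 = missing) per column."""
    patterns = Counter(
        tuple(0 if row.get(col) is None else 1 for col in columns) for row in rows
    )
    # Most frequent pattern first.
    return patterns.most_common()


# Hypothetical component scores for six respondents; None marks a missing score.
columns = ["affect", "value", "interest"]
rows = [
    {"affect": 4.5, "value": 5.0, "interest": 4.0},
    {"affect": 5.0, "value": 5.5, "interest": 4.5},
    {"affect": 3.5, "value": 4.0, "interest": 3.0},
    {"affect": None, "value": None, "interest": None},
    {"affect": None, "value": None, "interest": None},
    {"affect": 4.0, "value": None, "interest": 3.5},
]

patterns = missing_patterns(rows, columns)
```

Here there are three patterns: fully observed (three respondents), fully missing (two respondents), and only "value" missing (one respondent).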
When both pre- and post-surveys are pooled together, there are 23 missing data patterns. The two most frequently occurring missing data patterns in this pooled dataset were among students completing all pre-SATS-36 items required to compute the six components, but not completing all the items required for any of the post-SATS-36 items (n = 278, 19% of randomized students), and students completing all the pre-SATS-36 items required to compute all the post-SATS-36 components, but not completing all the items required to compute any of the post-SATS-36 components (n = 22, 2% of randomized students). The other 22 missing data patterns occurred in less than 2% of randomized students.
Figure 3 shows the distribution of scores from the pretest components of SATS-36 by weekly E-mail digest group.
The majority of students were female, had a CGPA of B or C, and had positive attitudes toward statistics at the start of the course, and the distribution of pretest SATS-36 scores is similar in each group across the six SATS components.
Data on which respondents opened any E-mail at least once were available from MailChimp, the E-mail platform that was used to send the weekly E-mails. A total of 621 (57%) students opened the weekly E-mail at least once. The distribution of students who opened at least one E-mail by E-mail group is shown in Table 6. A greater proportion of students in the nudge group opened at least one E-mail (61% in the nudge group vs. 54% in the no nudge group).
Following Millar and Schau (2010), SATS-36 gain score (the difference between post- and pre-SATS-36 score) was modeled with an adjustment for pretest score. A linear model was fit to evaluate the effect of the E-mail intervention, using the R function lm(), with gain score for each attitude component as the response and with the following covariates: pretest SATS-36, cumulative grade point average, sex, lecture section, E-mail group, and an indicator for opening the course E-mail at least once. The least-squares means of the gain scores, obtained from the lsmeans() function from the lsmeans library (Lenth
Table 7. Adjusted mean course performance by weekly E-mail digest group.

Weekly E-mail digest group
Opened at least one E-mail
Figure 4 shows the adjusted confidence intervals for the mean difference between the nudge and no nudge groups in the changes in attitudes. There were modest, not statistically significant differences in means between the E-mail groups. The results of this analysis are similar when conducted on the subgroup of students that opened an E-mail at least once.
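The gain-score model with a pretest adjustment can be sketched in a few lines. This is a simplified illustration, not the study's analysis: the study fit the model in R with lm() and included CGPA, sex, lecture section, and the E-mail opening indicator, whereas the sketch keeps only an intercept, the pretest score, and a nudge indicator. The data are synthetic and noiseless (generated from known coefficients so the fit can be checked), and the function names are hypothetical. In a model without interactions, the coefficient on the group indicator is the adjusted difference in mean gain between the two groups.

```python
def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    p = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(p)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        pivot = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        for r in range(p):
            if r != c:
                factor = A[r][c] / A[c][c]
                A[r] = [a - factor * b for a, b in zip(A[r], A[c])]
    return [A[i][p] / A[i][i] for i in range(p)]


def fit_gain_model(records):
    """Regress gain (post - pre) on an intercept, the pretest score, and a nudge indicator."""
    X = [[1.0, pre, float(nudge)] for pre, _post, nudge in records]
    y = [post - pre for pre, post, _nudge in records]
    return ols(X, y)


# Synthetic records: gain = 1.0 - 0.3 * pretest + 0.2 * nudge, with no noise.
pretests = [3.0, 4.0, 5.0, 6.0, 3.5, 4.5, 5.5, 6.5]
groups = [0, 1, 0, 1, 1, 0, 1, 0]
records = [
    (pre, pre + (1.0 - 0.3 * pre + 0.2 * g), g) for pre, g in zip(pretests, groups)
]

intercept, pretest_slope, nudge_effect = fit_gain_model(records)
```

With real survey data the coefficients would carry sampling error, so an interval estimate for the group coefficient, as in Figure 4, is the quantity of interest rather than the point estimate alone.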
The analysis in Figure 4 that computed the adjusted least-squares means of the gain scores was repeated using the R function mice() from the mice package (van Buuren and Groothuis-Oudshoorn 2011) to impute missing values for missing SATS-36 components in the pre- and post-surveys. The results of this analysis are similar to those shown in Figure 4, indicating that the results are not sensitive to missing pre-SATS or post-SATS data.
The multivariate models for the gain scores had adjusted parameter estimates for opening an E-mail at least once that ranged from −0.1 to 0.0 with corresponding p-values that were at least 0.45. Thus, there is no evidence that opening an E-mail at least once was predictive of the gain scores for the SATS-36 components.
Table 7 shows the least-squares means of the final course marks adjusted for cumulative grade point average, and lecture section within each E-mail group and indicator for opening the course E-mail at least once, and a comparison between these means for the nudge and no nudge groups. The marks were out of 100. The difference between the E-mail digest groups among the subgroup of students that opened the E-mail at least once is larger than the difference among students that never opened the E-mail, although both differences are very small.
Table 8 shows the means and standard deviations for each SATS-36 component separately for each E-mail group, for both the pre- and post-tests. Modest decreases in attitudes were observed, but with minimal differences by randomization group. Most components had a decrease in mean score, except for the difficulty component, which had no change or a minimal increase in mean score. While decreases in mean scores for the components interest and effort were practically meaningful with declines greater than 0.5, the differences in these decreases between the no nudge and nudge E-mail groups were neither practically nor statistically significant.

Limitations
Several limitations related to the design of this study are important, and may have impacted the results we observed.
Even though the design allowed us to carry out a randomized study, adding interesting content to weekly E-mail digests as a "nudge" to encourage engagement may not have been a strong enough intervention to see an effect. "Fun" and "interesting" are subjective, and what the teaching team decided was "fun" and "interesting" may not have been perceived as such by students.
Students were randomized to receive a nudge (active treatment) to read and explore applications of statistics in real life through a weekly E-mail, but our data show that only slightly more than half the students actually opened the weekly E-mail. Students in the nudge E-mail group opened the E-mail more often compared to students in the no nudge E-mail group (control treatment). If we assume that approximately half the students not only opened but also read the information in the E-mail, then this may bias the effect on attitudes toward the group that received a nudge. Students may have opened the E-mail, but not clicked on the links and read the content, which may bias the effect on attitudes toward the group that didn't receive a nudge. There are several variations of these potential biases: students who received a nudge clicked on the link, but didn't read the content; students that didn't receive a nudge sought out interesting stories about statistics in everyday life; or a student who received an E-mail with a nudge might have forwarded it to another student in the class who received an E-mail without a nudge. These potential biases illustrate the myriad reasons why students may not have followed the nudge or received a nudge even though they were randomized to the no nudge group.
It was not feasible for the study to collect data on clicks and time spent on links provided in the nudge group. Finally, our data on E-mail opening may underestimate the number of students who opened an E-mail. The E-mail platform MailChimp tracks opens with a web beacon, a hidden graphic embedded in the E-mail, which won't work if, for example, a student chose not to display images (MailChimp n.d.). It is difficult to know what, if any, effect these treatment uptake limitations might have on students' attitudes toward statistics.

Conclusions
Our findings are consistent with Lesser, Pearl, and Weber III (2016), who found no evidence of an effect of inserting a song on student attitudes as measured by SATS-36. Other studies have resulted in the conclusion that attitudes toward statistics are stable and difficult to change in a single course, although "... a few researchers have reported modest success at improving student attitudes in individual courses" (Pearl et al. 2012). Gal, Ginsburg, and Schau (1997) "caution that studies of pre-post course change as measured by such instruments, often show little change, perhaps indicating the stability of the factors studied and the accompanying resistance to change." In discussing the promises and limitations of nudging in education, Oreopoulos (2020) states that, "Many recent attempts to test large-scale low touch nudges find precisely estimated null effects, suggesting we should not expect letters, text messages, and online exercises to serve as panaceas for addressing education policy's key challenges."
Engaging many students in one introductory course is challenging. Including elements of remote instruction further adds to the challenge. Frequent communication is a commonly suggested best practice in remote instruction (see, e.g., Darby 2019, p. 87). Regular E-mail or other electronic methods of communicating with students provide simple, convenient methods for instructors to engage a large number of students simultaneously. Our study suggests that including applications of statistics that some might view as "interesting" or "amusing" may not result in measurable benefits in shifting short-term attitudes toward statistics, and instructors considering making this extra effort to enhance their communications may not observe a change in attitudes by the end of their course.