Designing a Large, Online Simulation-Based Introductory Statistics Course

Abstract

We designed an asynchronous undergraduate introductory statistics course that focuses on simulation-based inference at the University of Nebraska-Lincoln. In this article, we describe the process we used to design the course and the structure of the course. We also discuss feedback and comments we received from students on the course evaluations, and we reflect on the course after teaching it for the past three years. Our goal is to provide useful tips and ideas for instructors who have developed or are developing their own asynchronous introductory course. While we emphasize simulation-based inference in our course, we believe that many of the design features of this course would be useful for those using a traditional approach to inference in their introductory courses. Supplementary materials for this article are available online.


Introduction
The number of fully- and partially-online degree programs, certificates, endorsements, and minors offered by the University of Nebraska system has grown rapidly since 2010. However, in 2017, there was no fully online version of an introductory statistics course available across the entire university system. Because introductory statistics is a fundamental component of many degree programs, this gap presented a roadblock to degree completion. To allow for more delivery-mode and schedule flexibility for students, we were approached by University of Nebraska Central Administration in Spring 2017 and asked to consider developing a fully online, asynchronous section of introductory statistics.
We were in an enviable position. The University of Nebraska Central Administration had funding available and provided a generous grant to assist with course development. This allowed us to collaborate closely with an internal instructional designer, Sydney Brown, as well as hire an outside firm for professional design of the online course. We were given ample time (about a year) to design, refine, and pilot the course before a full launch. We realize these are ideal circumstances that likely do not reflect the position of most instructors developing an online course. We hope our experiences, particularly what we learned from our instructional design partners, will help those designing or refining an online course.
The course we describe here uses the simulation-based approach to introduce concepts to students, a distinction from Rayens and Ellis (2018), and adheres to best practices for online courses by theoretically grounding the course in the Community of Inquiry (CoI) framework, which we describe next.

Community of Inquiry
While we were developing the course, we considered the Community of Inquiry framework (Garrison, Anderson, and Archer 2000; also see Castellanos-Reyes 2020). The framework identifies important elements of online courses in higher education (Garrison, Anderson, and Archer 2000). In this framework, learning happens through three interconnected presences: cognitive, social, and teaching presence.
Cognitive presence refers to "the extent to which the participants in any particular configuration of a community of inquiry are able to construct meaning through sustained communication" (Garrison, Anderson, and Archer 2000, p. 89). Cognitive presence is important in order for the process of critical thinking to occur. According to a practical inquiry model proposed by Garrison, Anderson, and Archer (2000), the process of critical thinking has four phases: triggering event (e.g., problem identification), exploration (e.g., critical reflection and discourse), integration (e.g., constructing meaning), and resolution (e.g., implementing solution from integration phase).
Social presence refers to an individual's ability to affectively and socially project themselves, which allows their peers and instructors to perceive them as "real people" (Rourke et al. 1999; Garrison, Anderson, and Archer 2000). There are three major categories of social presence: affective responses (e.g., using emoticons), interactive responses (e.g., explicitly referencing others' responses), and cohesive responses (e.g., referencing others by name; Rourke et al. 1999).
Teaching presence refers to "the design, facilitation, and direction of cognitive and social processes for the purpose of realizing personally meaningful and educationally worthwhile learning outcomes" (Anderson et al. 2001, p. 5). There are three major categories of teaching presence: design and organization (e.g., establishing netiquette), facilitating discourse (e.g., acknowledging student contributions), and direct instruction (e.g., presenting content; Anderson et al. 2001).
The presences are interconnected (Garrison, Anderson, and Archer 2000). For example, selecting content can be considered both cognitive and teaching presence, whereas supporting discourse can be considered cognitive and social presence. Additionally, setting climate can be considered both social and teaching presence. The manner in which the CoI framework informed our online course design is addressed in Section 3.

Overview
The non-calculus introductory statistics course at the University of Nebraska-Lincoln is often referred to as "Stat 101" in the literature (e.g., Tintle et al. 2018; VanderStoep et al. 2018). At our institution, approximately 550 students from a wide variety of majors enroll in the course each semester as it fulfills a quantitative reasoning general education requirement. There are usually a dozen sections of the course offered each semester. Both graduate students and faculty teach these sections, and each instructor has some freedom in the design of their section. For example, each instructor can choose the grading scheme and type of assessments given in their course.
There are many commonalities across sections, though. To begin, all sections introduce students to both simulation-based inference and theoretical approximations. Generally, the simulation-based approach is used to introduce students to new concepts, and once students have gained an intuition for the material, we broadly introduce the traditional methods. Our approach follows Tintle et al.'s (2016) textbook, which we use with the WileyPlus integration in the learning management system (Canvas). This course recently started participating in a campus-wide initiative (Successful Teaching with Affordable Resources), which strives to make textbooks affordable for students. With this initiative, students do not have to purchase a textbook or access code themselves. All students who are enrolled in the course have access to the online textbook and WileyPlus platform, and the reduced price of the textbook is billed through student accounts. A second commonality is that the Guidelines for Assessment and Instruction in Statistics Education (GAISE; GAISE College Report ASA Revision Committee 2016) are addressed in every section. Each section addresses these guidelines slightly differently; however, there are common techniques. For example, the textbook's applets are often incorporated in the courses to conduct both simulation-based and theoretical methods for hypothesis testing, which addresses the fifth guideline (i.e., "use technology to explore concepts and analyze data"). Similarly, examples within Tintle et al.'s (2016) textbook are primarily rooted in real-world data, which is one way the third guideline (i.e., "integrate real data with a context and purpose") is addressed.

Development Process of the Online Course
The development of the course started about a year prior to its full implementation. Two statisticians (Ella Burnham and Erin Blankenship), an internal university instructional designer (Sydney Brown), and a few external instructional designers were involved in the process. The development process began with Burnham and Blankenship determining the learning goals for the course. Then, we aligned the schedule, assignments, and reading reflection questions around the learning goals. We also identified when videos would be important but did not create them until the roll-out of the course. The instructional designers provided insight and suggestions on creating an effective online course. The external developers built much of the initial Canvas shell in a sandbox course. After the entire team felt the course was fully developed, a pilot section with an enrollment capacity of 20 students was offered during the summer immediately before the full offering in Fall 2018. Burnham and Blankenship were the instructors of the pilot and chose a small enrollment capacity because of the unforeseen challenges that might arise during the initial offering of the course.
We learned a few things during the development and pilot. First, we learned about the importance of organization in an online course. Second, a feature we would have liked to include in the design of the course was interactive data submission: students would see their data point being drawn onto a plot as they submitted it. However, there was no simple way to implement such interactive graphics. Third, we learned that not every student reads written feedback on homework assignments; therefore, we decided to provide feedback through videos as well. More details about organization and feedback are provided in upcoming sections.
Following the pilot, a quality check was conducted on the course using the Open SUNY Course Quality Review Rubric (OSCQR; Online Learning Consortium, Inc n.d.). OSCQR makes use of the OLC (Online Learning Consortium) Online Course Quality Framework, which "provides a process, tools, and resources to systematically review, refresh, and continuously improve online courses" (Online Learning Consortium, Inc n.d.). The rubric is additionally informed by the CoI model along with several other sources representing evidence-supported practice in online teaching.
The OSCQR rubric has 50 quality indicators distributed across six sections:
1. Course overview and information
2. Course technology and tools
3. Design and layout
4. Content and activities
5. Interaction
6. Assessment and feedback
An independent instructional designer used these indicators as a checklist before the launch of the course to set a baseline standard of quality that conforms to accepted best practices for online instruction.

Online Course Structure
The enrollment capacity for the online introductory statistics section from Fall 2018 through Spring 2021 was 90 students, and the section was fully enrolled by the start of each semester. The course had three or four instructors each semester, two of whom were always Burnham and Blankenship. We viewed every member of the instructional team as an equal partner (i.e., there were no "graders"; all of us were simply "instructors").
We structured the course material similar to a Monday, Wednesday, and Friday face-to-face course. On a typical non-examination week, students covered a new section on Monday, and the assignments for Monday's material were due at 8 a.m. on Wednesday. The materials were generally accessible to students the weekday before that course day (e.g., Monday's material was made available to students the Friday before). Then, the new material for Wednesday was accessible on Tuesday and due at 8 a.m. on Friday of the same week. Thus, students generally had access to the materials and could complete their assignments over three weekdays. This is similar to the "moving window" that Rayens and Ellis (2018) discussed in their course design. The goal of this schedule was to provide frequent exposure to the material to support students' cognitive presence. One of the most frequent critiques from students during early semesters was the minimal advance notice of assignments for upcoming days and weeks. To address this critique, we started publishing assignments, with restrictions, two to three weeks in advance. The restrictions prevented students from accessing the content of the assignments early, but the due dates of the assignments appeared on students' Canvas calendars.
All of the course material was posted to Canvas or linked to within Canvas. We strove to organize the course content so that all material was easily accessible and only one click away. The first module in the course was referred to as the "Start Here" module. This module provided an introduction to the course and explained many of the expectations and elements of the course. For example, one page was entitled "Course Overview" and hosted the syllabus and instructors' introduction videos. In these videos, instructors incorporated immediacy behaviors: they introduced themselves to their students, shared a few personal details about themselves, discussed some course expectations, and provided tips for being successful in the course. Other pages provided tips for working with PDFs and explained netiquette (i.e., etiquette for participating in conversation online). The goal of the netiquette page was to help promote a safe and welcoming environment, which would help support students' social presence. After students had viewed the "Start Here" module and syllabus, they were asked to complete a syllabus quiz. This quiz asked questions to help us form groups, addressed course policies (e.g., "how many homework scores get dropped?"), and provided tips for being successful in the course (e.g., "the best way to be successful in online Stat 218 is:"). The quiz gradually evolved over the semesters as we found additional items that we felt should be emphasized at the beginning of the semester. Refer to Appendix A in the supplementary materials for the questions on the most recent iteration of the syllabus quiz.
Since this course was asynchronous, there were no required weekly live interactions between the instructors and students. Much of the direct communication between instructors and students happened through discussion boards. We asked students to post all non-grade-related questions to the discussion board, similar to the discussion board expectations discussed by Rayens and Ellis (2018). If a student sent a direct message to one of the instructors with a non-grade-related question, we encouraged them to post it on the discussion board. Students were also encouraged to respond to each other's questions to support the community of inquiry within the course; the first student to provide a correct answer received an extra credit point toward their homework grade. Even with this incentive, students rarely responded to one another most semesters. In the syllabus, students were informed that the discussion board would be monitored by one of the instructors between 8 a.m. and 8 p.m. and that posts would typically receive a response within a couple of hours. To keep the workload manageable, the instructors rotated the days on which they were primarily responsible for responding to posts.
In addition to discussion boards, instructors relayed important information to students through Canvas announcements, direct messages, or sometimes both. We found that many students did not read the announcements; therefore, Canvas messages were typically used to relay the most essential information. The instructors held one office hour per weekday and posted these hours in an announcement the Friday before that respective week. Students were asked to E-mail the instructor who was holding the office hour to indicate that they planned to attend. This prevented two students from trying to meet with the instructors at the same time.
The instructors also created videos for the course, which supported our social and teaching presence. One common type was the weekly video, which discussed the overarching topics covered that week and the schedule. Similar to the discussion board rotation, the instructors rotated who created the video each week. There were also several content videos created by the instructors; we discuss these in more detail in Section 3.4.
Since face-to-face sections involve a considerable amount of group work, we wanted to support the community of inquiry within the course by providing opportunities for students to interact; therefore, there were approximately 10 group assignments during the semester. The method for forming groups evolved over the years, but during the most recent semesters, groups were formed using questions from the syllabus quiz. We considered whether students had requested specific group members, and we tried to form groups of requested members first. Next, we considered when students typically completed assignments (e.g., as soon as they are assigned or at the very last minute) and whether they were interested in meeting with their group in real time (i.e., face-to-face or video). Finally, we considered the time of day students typically completed their work (e.g., morning or afternoon) and the time zone they lived in. We tried to create groups whose members had similar responses to all of the questions but gave the timeliness of assignment completion and the desire for face-to-face interaction more weight; a sketch of one possible implementation follows this paragraph. With this technique, there was minimal group shifting during the semester; however, on occasion, there were slight shifts in groups when a group had several members drop the course.
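As a rough illustration, the Python sketch below forms requested clusters first and then sorts the remaining students so that those with similar responses sit in adjacent groups, weighting timeliness and the desire for live meetings most heavily. We describe our criteria only informally above, so the field names, trait ordering, and greedy strategy here are illustrative assumptions rather than the exact procedure we used.

```python
# Hypothetical sketch of the group-formation heuristic; field names,
# weights, and the greedy strategy are assumptions for illustration.
from dataclasses import dataclass

@dataclass
class Student:
    name: str
    requested: frozenset       # names of requested groupmates (may be empty)
    works_early: bool          # completes work soon after release vs. last minute
    wants_live_meeting: bool   # interested in meeting face-to-face or by video
    works_mornings: bool       # morning vs. afternoon worker
    utc_offset: int            # home time zone

def sort_key(s: Student):
    # Heavier-weighted traits come first, so students who match on
    # timeliness and desire for live meetings end up adjacent after sorting.
    return (s.works_early, s.wants_live_meeting, s.works_mornings, s.utc_offset)

def form_groups(students, size=4):
    remaining = list(students)
    groups = []
    # Honor explicit requests first: pull each requested cluster into a group.
    for s in list(remaining):
        if s not in remaining or not s.requested:
            continue
        cluster = [t for t in remaining if t is s or t.name in s.requested]
        if len(cluster) > 1:
            groups.append(cluster[:size])
            for t in cluster[:size]:
                remaining.remove(t)
    # Sort everyone else so similar students are adjacent, then chunk.
    remaining.sort(key=sort_key)
    groups += [remaining[i:i + size] for i in range(0, len(remaining), size)]
    return groups

roster = [
    Student("Ana", frozenset({"Ben"}), True, True, True, -6),
    Student("Ben", frozenset({"Ana"}), True, True, False, -6),
    Student("Caro", frozenset(), False, False, True, -5),
    Student("Devi", frozenset(), False, True, True, 1),
]
print([[s.name for s in g] for g in form_groups(roster, size=2)])
# [['Ana', 'Ben'], ['Caro', 'Devi']]
```

Sorting on weighted traits is a simple way to approximate similarity matching without solving a full optimization problem, which keeps the procedure easy to rerun when students add or drop the course.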
When we notified students that groups had been formed, we let them know that they would have the opportunity to evaluate themselves and their group members twice during the semester: at the middle and at the end of the semester. This evaluation asked students about other group members' quality of work, timeliness of work completion, accuracy rate, and overall performance. The last question on the end-of-semester evaluation was: "Regardless of what you've shared above, is there anyone in your group that you feel should not receive full participation credit? Please explain." Students were given full credit for completing the mid-semester evaluation, but grades on the final evaluation were impacted by group members' responses to the above question. This was still low-stakes, however.

Simulation-Based Inference
Since this course incorporated simulation-based methods for inference, students were generally asked to conduct a tactile simulation for each new test (e.g., one proportion, two proportions, and two means). A majority of the content videos that we created focused on some aspect of simulation. Some of the videos demonstrated the process for the simulation and connected the process to the research scenario. For example, a scenario in Tintle et al.'s (2016) textbook asks whether novice rock-paper-scissors players tend to throw scissors less often than would happen if the player were choosing between rock, paper, and scissors at random. This research question results in a hypothesized population proportion of one-third, which cannot be represented by an even coin flip, so students used a spinner to simulate the null distribution. Therefore, during the video, the instructor demonstrated using an online spinner and explained the connection between aspects of the spinner simulation and the original rock-paper-scissors scenario (the video is available in Appendix B of the supplementary materials).
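To make the simulation mechanics concrete, the following is a minimal Python sketch of the spinner simulation described above. The 12 throws per repetition match the data collection assignment discussed below; the number of repetitions, the seed, and the observed statistic of 2/12 are illustrative assumptions, not values from the course.

```python
import random

def simulate_null(n_throws=12, p_scissors=1/3, reps=10_000, seed=1):
    """Simulate the null distribution for H0: p = 1/3 by 'spinning' a
    three-region spinner n_throws times per repetition and recording
    the proportion of scissors spins."""
    rng = random.Random(seed)
    stats = []
    for _ in range(reps):
        scissors = sum(rng.random() < p_scissors for _ in range(n_throws))
        stats.append(scissors / n_throws)
    return stats

null_dist = simulate_null()

# Approximate p-value for an illustrative observed statistic of 2/12
# scissors against the one-sided alternative "less often than 1/3".
obs = 2 / 12
p_value = sum(stat <= obs for stat in null_dist) / len(null_dist)
print(round(p_value, 3))
```

Students performed the equivalent steps tactilely with a spinner, and the textbook applets discussed below automate the same resampling process.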
Other videos were created after students had submitted their simulated statistics; in these, the instructors discussed the null distribution created from the submitted statistics. One unforeseen challenge of having students conduct tactile simulations on their own was that we could not watch students as they worked. This proved challenging in two ways. First, if students did not understand how to simulate the data, we were not there to provide immediate help. Second, it was often obvious that at least some of the students were submitting data that was not actually simulated. This became apparent in two ways. First, some students submitted data that did not make sense in the context of the problem. For example, we asked students to play rock-paper-scissors 12 times and report the number of times their opponent played scissors; some students reported impossible values, such as values above 12. Second, the simulated distributions built from the data collected from students often had too many extreme values and would have been very unlikely to occur if all of the statistics were properly simulated. We made sure to mention the irregularities of the null distributions in our videos. Unfortunately, we are unsure how to fully resolve these issues in an online course.
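Although we do not know how to fully resolve these issues, the two symptoms are straightforward to screen for programmatically. The following is a hypothetical sketch, not a procedure we used, that flags impossible counts and compares the rate of extreme values against what the Binomial(12, 1/3) null model predicts; the cutoff of 8 and the example submissions are made up.

```python
# Hypothetical screening check for submitted "simulated" counts of
# scissors throws out of 12 under H0: p = 1/3.
from math import comb

N, P = 12, 1 / 3

def binom_tail(k, n=N, p=P):
    """P(X >= k) for X ~ Binomial(n, p): how often a properly
    simulated count should be at least k."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def screen(counts, extreme_at=8):
    """Flag impossible values and compare the observed rate of extreme
    counts to the rate the null model predicts."""
    impossible = [c for c in counts if not 0 <= c <= N]
    observed_rate = sum(c >= extreme_at for c in counts) / len(counts)
    return impossible, observed_rate, binom_tail(extreme_at)

# Made-up batch: one impossible value (15) and several extreme counts.
submissions = [4, 3, 8, 15, 9, 4, 5, 8, 2, 8]
bad, obs_rate, expected_rate = screen(submissions)
print(bad, round(obs_rate, 2), round(expected_rate, 3))
# [15] 0.5 0.019  -> far more extreme counts than the null model predicts
```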
After the initial tactile simulation, students were generally expected to use the applets associated with Tintle et al.'s (2016) textbook to create their null distributions. These applets provided a quick and efficient way for students to create a null distribution so they could answer the research question. There is also an applet that focuses on inference using theoretical distributions, which we relied on heavily when discussing the theoretical approach for each test. With all of the applets, students did not always understand what to input to get an appropriate distribution. We answered numerous questions and diagnosed many issues that students had with the applets on the discussion boards. We found that the easiest way to diagnose an issue was to have the student post a screenshot of their "filled-in" applet.
Most of the day-to-day activities fell under the homework category. To support students' cognitive presence by providing multiple ways to practice the desired learning outcomes, we incorporated several types of activities. The two most common formative assessments were exercise assignments and explorations. Exercise assignments consisted of machine-graded questions from Tintle et al.'s (2016) textbook through the WileyPlus integration on Canvas. Students received immediate feedback and could attempt each problem up to three times.
Exploration assignments consisted of an overarching scenario or research question and asked open-ended or numeric questions. These exploration assignments were provided as fillable PDFs from Tintle et al.'s (2016) textbook. The exploration assignments varied dramatically in length; therefore, to ensure the workload of the instructors was manageable, feedback was provided on these assignments by either selectively grading a subset of questions or creating video recaps, typically within a couple of days of the due date. For the first method, instructors went through the assignment, identified the questions most pertinent to the learning goals of that particular course day, and provided individualized feedback to each student through the Canvas gradebook. The instructors often kept a Word document with frequently used comments, as students tended to have similar misunderstandings. This allowed the instructors to easily provide the same comment to each student who made that particular mistake. For the second feedback method, instructors created short videos that walked students through the explorations question-by-question. These videos were posted on the respective week's discussion board. Regardless of the type of feedback provided, students were typically given credit based on completion of the assignment instead of accuracy.
A couple of other assignments that happened with some regularity were data collection and investigation assignments. Data collection assignments often asked students to carry out a tactile simulation to help them develop an intuition for the null distribution, such as using the spinner to simulate a null distribution with a hypothesized population proportion of one-third, as discussed above. These assignments heavily encouraged students' cognitive presence. Students were given credit based on completion. Investigations were similar to explorations, except students were expected to complete them as a group. Only one file was accepted per group, and these assignments were worth twice as many points as the explorations. Instructors provided feedback on a subset of questions on these assignments, similar to explorations, but the selected questions were graded based on accuracy.
In addition to the day-to-day formative assessments, there was also a project in the course: the Article Critique Assignment. This project required students to summarize and critique either a popular press article (Fall 2018 and Spring 2019) or a journal article (Fall 2019 through Spring 2021). The Article Critique Assignment was scaffolded and modeled to help ensure students understood the expectations. We introduced this assignment by providing two example critiques: one "good" (but not perfect) example and one "bad" (although not terrible) example. Students were asked to grade and discuss the example critiques with their groups on their Canvas discussion boards, using the same rubric the instructors would later use to grade the students' own article critiques. The rubric evolved over the semesters, especially as we transitioned from popular press articles to journal articles. Students were then asked to write an article critique themselves and post their draft to their group's discussion board. Shortly after, we asked students to provide meaningful and thoughtful feedback to help their group members improve their drafts. Based on these suggestions, students edited their critiques and submitted their final drafts. This assignment highlights students' cognitive presence, as it required students to synthesize and communicate several big ideas from the course (i.e., think critically about the article). To keep the workload manageable in this high-enrollment course, the instructors chose between three and five articles for students to select from, instead of allowing students the freedom to choose any published article. In a smaller class, instructors may be able to allow students to find their own articles to critique. This assignment also highlights the importance of organization, as we learned that each incremental task needed a separate Canvas page in order to keep students on track for completing all components of the assignment.
There were also a total of five quizzes during the semester. The first four quizzes were similar to the exercise assignments, as each quiz consisted of machine-graded questions from Tintle et al.'s (2016) textbook, but students could only attempt a question once. The last quiz of the semester was a modi-

There were also two exams in this course. Prior to Spring 2020, students whose mailing address was within 35 miles of campus were required to take these exams at the Digital Learning Commons (DLC) at the University of Nebraska-Lincoln. If students lived outside of this radius, they could find a proctor for their exams (e.g., a librarian, educator, or counselor). Students were given a two-day window to complete the 90-minute exam. The exams consisted of both machine-graded questions and open-ended questions. The open-ended questions typically focused on an overarching research scenario. One semester, students were asked to use the applets on the exams; however, there were many issues with the combination of the lockdown browser and the applets in the DLC, so we reverted to not requiring the applets. Students received feedback on the open-ended questions within two weeks of the exam due dates.
After the onset of the COVID-19 pandemic, the format of the exams shifted. Students were no longer required to go to the DLC or find their own proctor. Instead, each exam had two parts: a machine-graded portion and an open-ended portion. The first part consisted entirely of machine-graded questions and often focused on vocabulary and basic statistical concepts. This portion of the exam was timed: 30 minutes for Exam 1 and 45 minutes for Exam 2. Students only saw one question at a time. On Exam 1, each question was locked after students answered it; this exam focused heavily on vocabulary, and we hoped that locking questions after answering would deter students from looking up every question. Exam 2, however, focused more heavily on statistical concepts; therefore, its questions were not locked after answering. On both exams, students received a random subset of questions from larger question banks. The first question of Part 1 on both Exam 1 and Exam 2, however, was the following statement: "I understand that collaboration with anyone on this exam is not allowed. This includes homework help sites like Chegg, CourseHero, etc. I understand that such collaboration will result in a 0 on the entire exam. I further understand that collaboration is a violation of the UNL Student Code of Conduct and will also result in a conduct violation report filed with the Office of Student Affairs. By typing my full name below, I am electronically certifying that I understand this."
The second part of Exam 1 and Exam 2 consisted of openended and numeric questions, similar to the exploration activities. This portion focused on the application of the statistical concepts and required students to carry out analyses using the applets. For Exam 1, each student was given two overarching scenarios and asked questions leading them through the statistical investigation process for each. For Exam 2, each student was given only one overarching scenario. To help deter cheating on this portion of the exam, there were four versions of each exam. Since groups typically consisted of a maximum of four students, multiple students within the same group did not usually receive the same version. The second part of the exams was not timed, and students received feedback on these questions within two weeks of the exam due dates.
Students had a 33-hour window to complete each exam. Both parts of the exam opened at 8 a.m. on Day 1 and closed at 5 p.m. on Day 2. During this time, instructors only answered technology and logistics questions that arose. Even though both portions of the exams were open-book and open-note, students were still encouraged to study for the exams.

Course Evaluations
At the end of each semester, students were sent information about the course evaluations. The course evaluations were given online. During Fall 2018 and Spring 2019, the course evaluations were hosted on a website outside of Canvas. There were seven open-ended questions relating to things such as the course content, assessments, and instructor. Beginning in Fall 2019, the evaluations were hosted within the course Canvas shell. The questions also changed considerably, as the university was adopting a common campus-wide evaluation. For Fall 2019, Spring 2020, Fall 2020, and Spring 2021, there were always at least the following two open-ended questions asked:

• What has been beneficial to your learning? From the following list of teaching elements, what is the one thing that has been the most beneficial for your learning in this course so far? After your selection, please provide written comments about the element.

• What could use some improvement? From the following list of teaching elements, what is the one thing that could most use some improvement to increase your learning? After your selection, please provide written comments about the element.
Some of the options were inclusiveness, course performance expectations, course learning materials and tools, quality interactions with students, and instructor communication.

Results
Across all five semesters (Fall 2018 through Spring 2021, excluding Spring 2020), the response rate was only 35.26% (140 of the 397 enrolled students completed the course evaluation). There were four overarching topics that appeared consistently in the responses to the open-ended questions: assignments, instructor communication, group work, and mode of learning.
The most frequent comments about the course were critiques relating to assignments and assignment frequency. Many students did not like that this course typically had assignments due every Monday, Wednesday, and Friday. Students felt the workload was too heavy and that the assignments were repetitive (i.e., "just busy work"). For example, one student felt that "this course had so so so much work and it was very hard to keep up." We chose to have frequent assignments to keep students consistently engaged with the material, and we believe this helped students who might otherwise have dropped out. For students who did not need this additional structure, the frequency was merely annoying and did not seem to impact their performance. Additionally, while students had more frequent assignments, we do not feel the workload was dramatically different from a typical face-to-face course. The most fundamental difference was that students typically worked through the exploration assignments by themselves instead of with a partner or group, as they would in a typical face-to-face course. There were some students, though, who appreciated the assignment structure. For example, one student wrote: "both the multiple choice homework questions and the pdf assignments make me feel like I am actively learning each week, and they are pretty helpful."

Additionally, many students desired the ability to work at their own pace, expecting more flexibility in an online course than in a face-to-face course. Rayens and Ellis (2018) discussed a similar critique from their students. As instructors, and with the advice of instructional designers, we decided that following a strict timeline was most beneficial for our students because it kept everyone learning the material at the same pace. Other student comments related to the inability to have a "bad" day in the course; however, as instructors, we felt that dropping the lowest 10 assignments was more than adequate to let students miss or skip days on occasion. Other students argued simply that they did not know due dates far enough in advance during early semesters. We addressed this by publishing assignments two to three weeks in advance; afterward, we received minimal comments about due dates. One student even sent the instructors an E-mail stating: "information was always available and well designed. I've taken many online classes and I wish they were all as clear as this."

Another topic that was frequently addressed in the course evaluations was group work. The comments on this were quite mixed. Many students strongly disliked the group work. For example, one student stated that "the least positive learning experience was the group investigations. It was the classic problem of group projects, some people doing more and some people doing less." A few students, however, appreciated the group work. For example, one student stated that "by working with my group I was able to get help on things I didn't understand." As instructors, we observed that many groups did not work together as we had hoped. One of our primary reasons for incorporating group work was to help students connect with each other in this distant environment and learn from one another. Many groups seemed to use a "divide and conquer" technique to complete their group work (e.g., one group member completed the first half of the assignment while another completed the second half). Because of this, we regularly received group assignments that were not cohesive and had drastically different answer quality throughout. There may be methods we could use in future semesters to encourage better approaches to working as a group. For example, we could be more explicit about our goals and expectations for how effective groups typically work together and model an effective group conversation.
In addition to peer interactions, students' comments often addressed instructor communication. To begin, many students acknowledged and appreciated the timely feedback on assignments. Other students addressed direct interaction with the instructors on the discussion boards. Many of these comments were positive: several students felt the discussion boards provided a great resource for discussing various topics and were happy that the instructors answered questions on the discussion boards quickly. For example, one student wrote that "every instructor was very helpful, constantly responding to students questions on discussion boards and providing helpful feedback." A few students would have preferred to E-mail questions directly to the instructors instead of posting on the discussion board, but such comments were relatively rare.
Lastly, there were quite a few comments relating to the mode of learning in this course and the associated textbook. Many students disliked that much of the learning occurred through reading the textbook and expressed a desire for more videos. For example, one student stated that they "would prefer if material was presented by lecture videos. I believe this would help students be more engaged and it would make the material easier to learn." However, the instructors could see which students watched the videos we created and whether the videos were watched in their entirety. We observed that many students did not watch the videos; therefore, we are not convinced that students would have watched additional videos if we had created them.
Some students also commented on the textbook and its associated resources, which received quite mixed reviews. Some students found the textbook difficult to follow or overwhelming; however, others appreciated the online textbook and associated resources (e.g., videos created for the textbook by Tintle). Most of the comments about the resources were positive. During the first few semesters, we did not feel we drew enough attention to the textbook's associated resources; therefore, we emphasized them in later semesters. This emphasis appeared beneficial, as all of the feedback related to the additional textbook resources (which was mostly positive) came during Fall 2019 through Spring 2021.

Reflections
Reflecting on the past three years, the instructors of the course have some thoughts about the course design and delivery. First, we were fortunate to have external instructional designers on our development team. They were involved in this process as the university did not have the capacity on the instructional design team to support this project at that time. We appreciated their assistance on this project, but in the end, their contribution was not one of the main drivers of the success of this course. We believe that similar projects would be manageable without the external support.
Second, the introductory statistics course at our campus is well-subscribed. The course is typically fully enrolled before first- and second-year students have the chance to register for classes. Unsurprisingly, offering the online course does not fix this enrollment issue. We feel that, to provide a high-quality online experience, we still need a student-instructor ratio similar to that of our face-to-face sections; thus, we do not believe the online course is any more efficient than face-to-face courses.
Third, although students generally appreciated the discussion boards and our quick responses to their questions, we were hoping to observe more student-student interactions in the course. We wanted the discussion boards to be driven by students. As mentioned previously, we offered extra credit for correctly responding to another student's question, and at the beginning of the semester, the instructors delayed their responses on the discussion boards (e.g., waiting two hours before responding to a student's question). During most semesters, though, students were still not responding to each other after the first several weeks of the course; at that point in the semester, the instructors typically started responding within an hour of a student posting a question. We are not sure what other methods we could use to encourage students to respond to one another without making it a course requirement.
Fourth, office hours were not frequently used. While the specific times that office hours were held changed each week, many of our appointments with students happened outside of these hours. One technique we may incorporate in the future is on-demand office hours, as discussed in Rayens and Ellis's (2018) online course design. Instead of having set times every week, we would have students E-mail us appointment requests.
Fifth, we enjoyed the updated format of the exams after the onset of the pandemic. We appreciated that we could require students to conduct a more comprehensive analysis involving the applets during the open-ended portion of the exam, while also asking machine-graded questions in a timed fashion on the first part of the exam. Upon reflection, we are glad that we chose to not use an online proctoring service for our exams during Fall 2020 and Spring 2021. The major reason that we chose not to was the cost; we did not want to require students to spend additional money on the course. We also considered critiques we had heard about the services from instructors who had experience with online proctoring services. Students and faculty have raised concerns that online proctoring may even be biased based on gender and race (Hardesty 2018). For these reasons, we would choose this exam format in future semesters instead of using an online proctoring service.
Sixth, we were more accommodating with students if unexpected life events occurred after the onset of the pandemic, and we no longer required proof to make these accommodations. All students are human, and life can be quite unexpected. We had several students reach out to us for accommodations, but it was not an overwhelming amount of additional work for the instructors. We believe that these accommodations were important for students who had outside factors influencing them. As one student wrote in their evaluation during Spring 2021: "Without support, I would have failed at the beginning of those class. I got sick at the start of the semester and the teacher(s) were so understanding and supportive in me getting better as well as accommodating for what was going on at that time. It really made me want to continue the class and work hard." We will continue these practices in future semesters.

Future Directions
There are a couple of features we omitted due to instructor workload constraints that we believe would improve the course for students if they were implemented. The first is providing students with more individualized feedback (e.g., grading more questions on explorations and including open-ended quiz questions). We believe this would positively impact the instructors' teaching presence in the course (assuming the students read the feedback). The second is requiring students to participate in discussions on reflection questions. These questions could be similar to the reading reflection questions that we posted on each course day or could incorporate current relevant topics (e.g., the p-value controversy). These discussions could happen in smaller groups in large courses or as a whole class if the enrollment is small enough. We believe that if students meaningfully reflect on and discuss these questions, the discussions would have a positive impact on students' cognitive presence and the instructors' teaching presence.
We also recognize that every instructor has different resource constraints and teaching preferences, so we have reflected on a few aspects of the course that could be compromised or done in an alternative way. First, we believe that an instructor could teach the course online using most of the ideas we provided while focusing strictly on the traditional approach to inference; we are unsure how this would impact the community of inquiry within the course, however. Second, we realize that three course days per week might not be feasible for all instructors; therefore, we believe this course could be taught with two course days per week (e.g., Tuesday and Thursday). We are not sure what impact reducing course days would have on the community of inquiry, but we speculate it might negatively impact cognitive presence, as students would likely not interact with the material as frequently. We would highly discourage having only one course day per week. Third, we provided individualized feedback on many explorations, usually at least one exploration per week, and this may not be feasible for other instructors. A couple of alternative approaches are creating more video recaps of the explorations or decreasing the number of explorations while increasing the number of machine-graded assignments. The first alternative likely would not have much impact on the community of inquiry in the course, whereas the second may have a negative impact on the instructors' teaching presence. Lastly, group work may be something to rethink. Relatively few groups functioned well, and many did not function productively. So, while we are not suggesting that group work should be eliminated, we would encourage those designing online introductory statistics courses to think of ways to encourage more productive group work.

Supplementary Materials
Appendix A: Questions from the syllabus quiz. (pdf)
Appendix B: Video of the instructor demonstrating the online spinner simulation for the rock-paper-scissors scenario. (video)