Exploring the Use of Statistics Curricula with Annotated Lesson Notes

ABSTRACT In K–12 statistics education, there is a call to integrate statistics content standards throughout a mathematics curriculum and to teach these standards from a data analytic perspective. Annotated lesson notes within a lesson plan are a freely available resource to provide teachers support when navigating potentially unfamiliar statistics content and teaching practices. We identified several types of annotated lesson notes, created two statistics lesson plans that contained various annotated lesson notes, and observed secondary mathematics teachers implement the lesson plans in their intermediate algebra courses. For this study, we qualitatively investigated how two teachers’ instructional actions compared to what was prescribed in the annotated lesson notes. We found ways in which the teachers’ instructional actions, across their differing contexts, aligned with, varied from, or adapted to the annotated lesson notes. From these results, we highlight affordances and limitations of annotated lesson notes for statistics instruction and offer recommendations for those who create statistics curricula with annotated lesson notes.


Introduction
K-12 statistics education is evolving, leaving many teachers to navigate new and unfamiliar content, practices, and experiences when teaching statistics. National and state standards, along with recommendations in the Guidelines for Assessment and Instruction in Statistics Education, have increasingly emphasized statistics content throughout a K-12 mathematics curriculum (see Franklin et al. 2007; NGA 2010; Bargagliotti et al. 2020). Further, many of these statistical concepts are intended to be introduced "from a data analytic perspective with real-world data and simulation of random processes being prime instructional vehicles" (Franklin et al. 2015, p. 29). This increased and changing presence of statistics within mathematics curricula presents challenges for K-12 mathematics teachers, many of whom have little to no statistical background and have not received adequate training to teach statistics as is currently recommended (Franklin 2013; Lazar and Franklin 2015).
To support K-12 teachers as they teach statistics, the statistics and data science education community has developed many lesson plans for teachers to use. Several statistics lesson plans are freely available online and often contain additional comments and tips, known as annotated lesson notes (ALNs) (see Morris and Hiebert 2011), that provide different types of information to support teachers' implementation of a lesson plan. Yet, despite the prevalent creation and use of lesson plans for teaching K-12 statistics, few studies have examined how ALNs can broadly support teachers in teaching mathematics (e.g., Morris 2012) or statistics (e.g., Arnold 2016; Strayer et al. 2019), and little is known about how teachers interact with and implement different types of ALNs in their statistics instruction. This study aims to address this gap in the literature by uncovering potential affordances and limitations of ALNs in statistics lesson plans.

Annotated Lesson Notes: An Analogy
To understand the term annotated lesson note, consider cooking a new recipe and teaching a lesson for the very first time. In many ways, teaching a new lesson is analogous to cooking a new recipe. When you find a new recipe, it offers a list of ingredients and core instructions to make the food, much like when you find a lesson plan that provides a list of materials and core instructions for teaching a topic. Recipes and lesson plans are never one-size-fits-all and must adapt to meet the specific contexts and needs people and teachers face. The nuances of a recipe that often go unstated and are left for interpretation or discovery, such as what constitutes "a pinch" of salt and how long one stirs the ingredients so they are "combined, but not overmixed," often necessitate trial and error until one figures out just the right combination of tweaks to successfully make the food. Like recipes, lesson plans are often shared online and can be difficult for teachers to successfully execute in their first few attempts; with the differing contexts in which teachers work, teachers' differing experiences and strengths, and students' differing backgrounds and needs, among other factors, teachers can spend considerable time adapting and refining lesson plans after each implementation, trying to find what works for their students, classrooms, and local contexts. The tweaks and tips based on others' cooking or teaching experiences are often shared among friends and colleagues and, for some recipes and lesson plans, are directly added to the recipe's or lesson plan's core instructions. These additions offer valuable support when someone is cooking a recipe or implementing a lesson plan, allowing them to learn from others' experiences.
Such tips and tweaks for implementing lesson plans are referred to as annotated lesson notes; they are annotations within a lesson plan that are continuously updated to "describe all aspects of classroom instruction believed to affect students' opportunity to achieve the learning goals and that contain all the information needed to implement the [lesson] plan as described" (Morris and Hiebert 2011, p. 9). Analogous to how tips within a recipe are intended to support a person's execution of that recipe, ALNs are intended to support a teacher's implementation of a lesson plan. Arnold (2016) developed and investigated the use of ALNs in statistics curricula to support in-service secondary mathematics teachers as they used simulation-based methods to teach statistical inference in their intermediate algebra classes. She found that the ALNs, by themselves, were not sufficient in improving the teaching of statistical inference for all study participants, and the teachers implemented the statistics lesson plans with ALNs in different ways. Much like how you would expect the final product of a cookie recipe to differ between two people baking at different elevations, we expected there to be differences in how the teachers in that study implemented the statistics lesson plans with ALNs in their differing contexts. But we were left wondering how teachers' instructional actions compared to what was prescribed in ALNs of differing types, and how these comparisons differed between their classroom contexts. 
To explore these questions further, we situate the current study within the two most noticeably contrasting classroom contexts that arose in Arnold's (2016) research: (a) algebra classes following a schedule where students attend daily 50-min class periods and where statistics content standards are often taught as a separate add-on near the end of the school year and (b) algebra classes following a block schedule where students attend 75-min class periods on alternating days and where statistics content standards are integrated throughout the year. These contrasting contexts provide a rich opportunity to investigate different types of ALNs across different classroom structures.

Study Motivation and Purpose
The central question driving this investigation is: When secondary mathematics teachers implement a statistics lesson plan that contains different types of annotated lesson notes, how do the teachers' instructional actions compare to what was prescribed in those lesson notes? To explore this question, we selected one teacher from each of the algebra contexts described above and observed their implementation of two statistics lesson plans with ALNs. In this article, we provide evidence for how the teachers' instructional actions compared to what was prescribed in the ALNs, highlighting affordances and limitations of different types of ALNs. Knowing such affordances and limitations may help the statistics and data science education community create lesson plans with ALNs in ways that attend to the differing contexts and constraints teachers may encounter when teaching statistics.

Annotated Lesson Notes
In their 2012 paper, Hiebert and Morris document how the primary focus to improve teaching in the United States has been to recruit and retain more talented people and to improve the qualifications of teachers. Both approaches predominantly focus on improving the characteristics (or qualities) of individual teachers rather than creating instructional products that can be shared among teachers and continuously improved long after creation. They suggest an alternative approach to improve teaching, one that centers around developing, implementing, refining, and sharing lesson plans with ALNs.
A lesson plan with ALNs, defined by Hiebert and Morris (2012) as an "annotated lesson plan" (p. 94), is similar to a standard lesson plan, but is often more detailed so that "teachers can implement [it] as intended" (p. 95). Both include standard components such as learning goals and objectives, required materials, prerequisite knowledge, estimated time length, instructor moves, student moves, problem sets, and ideas for differentiation. What distinguishes an annotated lesson plan from a standard lesson plan is the addition of ALNs, which store detailed, up-to-date information and knowledge about how to teach a lesson plan, particularly for the first time (see Morris and Hiebert 2011; Hiebert and Morris 2012). For instance, ALNs can include anticipated student responses to problems, along with common student questions and conceptions; knowing this kind of information can help a teacher plan "how to use students' thinking during the lesson" (Hiebert and Morris 2012, p. 96). After repeated implementations in different classrooms, the ALNs can be updated to store new information and knowledge acquired from those implementations. Overall, ALNs are a means for a teacher to learn what others have effectively implemented before them and to adopt current teaching practices, tools, and materials in their own instruction (Cai et al. 2020).
Some statistics lesson plans already contain ALNs. The STatistics Education Web (STEW 2022), for instance, stores freely accessible peer-reviewed lesson plans; in their lesson plan template, they ask authors to include, among other things, discussion prompts and common student responses, directions for using technological tools, and optional teacher reflections on the lesson (see highlighted parts in Figure 1). These highlighted parts store additional knowledge that is useful for supporting teachers' implementations of the lesson plan and represent what Hiebert and Morris (2012) refer to as ALNs.
The information and knowledge stored in ALNs arises from a variety of different sources. Some ALNs are a result of teachers' implementations of and reflections on a lesson plan. For instance, after implementing a lesson plan, a teacher may reflect on several questions like, How can I revise the lesson plan for next time? Did the students achieve the learning goals? With what questions did students have difficulty? Teachers may even talk with colleagues or examine students' written responses to obtain more information to help them decide what they could do differently next time. In these instances, teachers have thought about how to improve their implementation of that lesson plan, and ALNs provide a place for teachers, along with others, to report and store this knowledge.
Other ALNs arise from current research on statistical knowledge for teaching (see Groth 2007, 2013) and may contain subject matter knowledge, pedagogical knowledge, technological knowledge, or some combination. For example, teachers of statistics must recognize the essential roles of context and variability and understand statistics-specific topics such as formulating statistical questions, designing studies, analyzing data, and reasoning with statistical models (Wild and Pfannkuch 1999; Franklin et al. 2015). In addition, teachers often need pedagogical knowledge to teach these concepts and create learning opportunities for developing their students' statistical thinking. Teachers also need to know how to effectively use technology to help students learn abstract statistical concepts more concretely (NCTM 2018; Bargagliotti et al. 2020). Best practices stemming from this kind of research on statistical knowledge for teaching are often stored in ALNs.
Regardless of how ALNs arise, they should reflect the dynamic reality of teaching. As Groth (2015) states, it is "imperative that curriculum development not end with the production of written curriculum" (p. 14). Similarly, lesson plans with ALNs are not intended to be viewed as static and final products. Much like how notes about recipes from the past are updated to reflect current ingredients, cooking tools, or cooking techniques across different settings, ALNs are meant to be continuously updated to store current knowledge on recommended content, teaching practices, and technologies for different classroom contexts. Continuously updating ALNs to reflect this knowledge, along with the knowledge gained from teachers' implementations, is a way to help lesson plans remain current and adapt to the ever-changing nature of education.

Examples of Annotated Lesson Notes
Figure 2 presents three different sets of ALNs (written in blue italicized font) from a lesson plan on interval estimation and margin of error. These specific examples are similar, but not identical, to the ones the participants in this study were given; the statistical content and ideas within these ALNs are similar to the originals (see Arnold 2016), but for the purpose of an example, we present more concise ALNs for brevity. Figure 2 also illustrates how we formatted ALNs within a lesson plan, using color, spacing, and italics to separate them from the standard components of a lesson plan (written in standard black font).
As shown in Figure 2, the ALNs enhance the standard components of the lesson plan. For example, the first set of ALNs offers sample student responses to the question about interpreting a margin of error and highlights a common misunderstanding of the term "error." The second set of ALNs provides recommendations regarding group work. Lastly, the third set of ALNs suggests instructional actions to support students' conceptual understanding of a sampling distribution. Together, these three sets of ALNs convey a variety of knowledge for teaching statistics and are characteristic of some of the different types of ALNs our participants received, as delineated in the section, "Description of the Statistics Lesson Plans and Annotated Lesson Notes."

Theory of Curriculum Implementation
Teachers work under a wide variety of contextual conditions, and it is expected that their instruction will differ from one another and from the written curriculum, even when implementing the same lesson plan (Cai et al. 2018). One goal of this study is to illuminate comparisons between the written and enacted lesson plans with ALNs across different contextual conditions. As researchers, we position the teachers' choices as logical and reasonable responses to many factors, including the varied classroom, school, and community contexts in which they work; the unique needs and priorities of their students, schools, and communities; and the varying resources to which they have access. The teachers work within the constraints and incentives of a system that shapes their instructional choices. We further acknowledge that our analysis documents and discusses differences observed during the teachers' instruction of two statistics lesson plans with ALNs, which is a limited snapshot of a teacher's class. Secondary mathematics teachers' choices during a couple days of instruction are based on their broader understanding of a student's entire mathematical and statistical education and what they think is best given the many constraints faced. For example, although it might be ideal to spend time on an open-ended exploration about a topic that is also covered in another class, teachers may choose to forego that exploration to make time for discussing a different topic.
When comparing written and enacted lesson plans with ALNs, we aimed to critique the ALNs rather than the teachers' instructional choices. To make these comparisons, we adapted Morris' (2012) theory of curriculum implementation, which is specifically tailored to improving ALNs. This decision was based on multiple criteria: Morris' expertise and research in the development and use of ALNs; the expectation that we would observe differences between the written and enacted lesson plans with ALNs in each teacher's instruction; and the commonality of using this theory to investigate how teachers implement a lesson plan that was created by others. Thus, for this study, we characterized a teacher's instructional actions as (a) alignments when a teacher's instructional actions were the same as what was prescribed in the ALNs; (b) variations when a teacher's instructional actions both differed from and, from the perspective of the observer, were not as effective as what was prescribed in the ALNs; and (c) adaptations when a teacher's instructional actions both differed from and, from the perspective of the observer, were as or more effective than what was prescribed in the ALNs.
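The three-way characterization above can be read as a simple decision rule. The following Python sketch is only an illustration of that rule; the names `Code` and `code_action` and the two Boolean observer judgments are hypothetical and were not part of the study's instruments.

```python
from enum import Enum


class Code(Enum):
    ALIGNMENT = "alignment"
    VARIATION = "variation"
    ADAPTATION = "adaptation"


def code_action(same_as_prescribed: bool, at_least_as_effective: bool) -> Code:
    """Apply the three-way coding rule to one observed instructional action.

    Both arguments stand for observer judgments (hypothetical names).
    """
    if same_as_prescribed:
        return Code.ALIGNMENT
    # The action differed from the ALN, so judged effectiveness decides the code.
    return Code.ADAPTATION if at_least_as_effective else Code.VARIATION
```

In this sketch, effectiveness is consulted only when the observed action differs from the prescribed one, mirroring the definitions in (a) through (c).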

Methods
In high school intermediate algebra courses in the United States, state and national content standards often include statistical concepts focusing on summarizing, representing, and interpreting data; understanding and evaluating random processes underlying statistical experiments; making inferences and justifying conclusions from surveys, experiments, and observational studies; and using simulation-based methods (e.g., NGA 2010). Thus, the target population was secondary mathematics teachers who were teaching (at the time of this study) intermediate algebra and whose students had access to an Internet-enabled computer or tablet to conduct a computer simulation. After receiving IRB approval, we provided each teacher with two statistics lesson plans containing a variety of ALNs, and we observed their implementations of the lesson plans, focusing on how their instructional actions compared to what was prescribed in the ALNs.

Description of the Statistics Lesson Plans and Annotated Lesson Notes
We adapted statistics materials from the NSF-funded CATALST Project (Garfield, delMas, and Zieffler 2008) to create two sequential lesson plans with ALNs (Table 1). Each lesson is intended to span approximately two 50-min class periods, and each contains an inquiry- and simulation-based class activity handout where students first conduct a simulation by hand using index cards and then transition to a web applet to conduct a computer simulation. In Lesson 1, students are introduced to observational studies and experiments, and use simulation to explore the concept of lurking variables and the advantages of random assignment. In Lesson 2, students simulate a bootstrap distribution to create a margin of error and confidence interval and develop an understanding of sampling variation. Both lesson plans with ALNs were developed, tested, and refined by the authors, along with other researchers and teachers (both at the high school and university level). All members contributing to the development of the lesson plans were familiar with the content standards and had experience teaching these topics. We note that the teachers in this study were not involved in the development of these lesson plans due to the research design of the larger study in which they participated (Arnold 2016). Instead, we gave the teachers a document describing the overview of each lesson plan, the purpose of ALNs, and how to identify the ALNs within the lesson plans. Table 2 displays our categorization of seven types of ALNs that we added to each lesson plan, together with a description and example of each. These seven types of ALNs are not meant to be mutually exclusive but are instead meant to capture the primary focus of the information and knowledge stored in each type. For example, ALNs categorized as Enhance Student Understanding and ALNs categorized as Recommendations & Reflections may store similar kinds of information and knowledge, but this categorization allows us (as researchers and authors of the ALNs) to distinguish between the former's primary focus on providing instructional actions that help support students' conceptual understanding of statistics and the latter's primary focus on sharing previous instructors' recommendations and reflections about the lesson. Further, we note that this categorization is not exhaustive; rather, this list is specific to the lesson plans for this study and additional types of ALNs may exist in different lesson plans.
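For readers less familiar with the Lesson 2 simulation, the sketch below shows one way a bootstrap distribution can yield a margin of error and an interval estimate. It is a minimal illustration and not the web applet's implementation: the sample data are invented, the function names are hypothetical, and taking the margin of error as two standard deviations of the bootstrap distribution is an assumption made here for simplicity.

```python
import random
import statistics


def bootstrap_proportions(sample, n_resamples=1000, seed=0):
    """Return one resampled proportion per bootstrap resample (one "dot" each)."""
    rng = random.Random(seed)
    n = len(sample)
    dots = []
    for _ in range(n_resamples):
        resample = rng.choices(sample, k=n)  # draw n values with replacement
        dots.append(sum(resample) / n)       # one dot = one resample's proportion
    return dots


def margin_of_error(dots):
    """Assumed rule: two standard deviations of the bootstrap distribution."""
    return 2 * statistics.stdev(dots)


# Hypothetical sample of 50 responses: 1 = yes, 0 = no (invented for illustration).
sample = [1] * 30 + [0] * 20
dots = bootstrap_proportions(sample)
p_hat = sum(sample) / len(sample)
moe = margin_of_error(dots)
interval = (p_hat - moe, p_hat + moe)
print(f"estimate={p_hat:.2f}, margin of error={moe:.3f}")
print(f"interval=({interval[0]:.3f}, {interval[1]:.3f})")
```

Each value appended to `dots` corresponds to a single dot on the dot plot: the statistic computed from one resample.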

Participants
For this study, we were interested in the two different algebra classroom contexts discussed in the section, "Study Motivation and Purpose." We returned to our notes from Arnold's (2016) study and purposely selected one teacher situated in each context whose implementation of the lesson plans provides a rich description of the observed ways in which their instructional actions compared to those prescribed in the ALNs. The two teachers we selected, Karen and Robin (pseudonyms), both had graduate degrees and similar experiences teaching statistics, and neither had received training for the implementation of a simulation-based introduction to statistical inference (see Table 3).

Data Collection and Analysis
Karen and Robin video-recorded their implementations of the two lesson plans with ALNs, and through these video observations we analyzed how their instructional actions compared to those prescribed in the ALNs. Each lesson was intended to span two 50-min class periods, and for Lesson 1, Karen used one 50-min class period and Robin used two 75-min class periods. Likewise, for Lesson 2, Karen used one 50-min class period and Robin used two 75-min class periods. We also invited both teachers to answer a series of questions about the lesson plans with ALNs and their implementations via a survey, semistructured interview, or both; Karen chose to answer questions via a survey, and Robin completed both a survey and interview. The questions aimed to capture how each instructor interacted with the ALNs while preparing for and implementing the lesson plans, and they provided additional context that allowed us to triangulate what we analyzed in the video recordings. The authors used a deductive coding process (Miles, Huberman, and Saldaña 2014) to code how each teacher's observable, in-the-moment instructional actions compared to those prescribed in the ALNs. The first author led the coding process and then discussed and checked for agreement with the second author and other researchers familiar with the study. Drawing from the theory described in the section, "Theory of Curriculum Implementation," we developed the following a priori, descriptive codes: alignments, variations, and adaptations. When identifying instances where a teacher's instructional actions aligned with, varied from, or adapted to actions prescribed in the ALNs, we relied on our experiences teaching the content, our understandings of the intended enactment of the lesson plans with ALNs, and current research on best practices in teaching statistics content (e.g., Franklin et al. 2007, 2015). Collectively, these resources helped us limit subjectivity when coding the teachers' instructional actions. Further, we wrote memos corresponding to each observed alignment, variation, and adaptation to help us identify features of the ALNs, instruction, or classroom environment that may have contributed to those instances.
Prior to the coding process, we also distinguished between ALNs with actions that focused on the preparation of the lesson versus those that focused on the enactment of the lesson. This distinction helped us identify which ALNs we would be able to directly observe in a classroom video recording. ALNs that focused on preparation did not prescribe a specific instructional action for the teacher to implement; thus, these ALNs were not directly observable during a teacher's instruction. For instance, one Sample Student Responses ALN indicates how students may respond during the lesson, which may help the teacher prepare their delivery of the material. Yet, this is not an action that is captured on a video recording during instruction. Unless the teacher specifically mentioned how these kinds of "unobservable" ALNs influenced their instruction, we were not able to code teachers' corresponding instructional actions as alignments, variations, or adaptations.
Before coding the data, we also identified ALNs with multiple parts where each part needs to be implemented, as intended, to be coded as an alignment. For example, one ALN (see the third set in Figure 2) prompts teachers to create a dot plot of the class's data and then ask the class, "How is one dot created?" The action of only creating a dot plot of the data does not represent the complete set of actions that, collectively, are intended to enhance students' conceptual understanding of sampling distributions. Thus, we would code the actions of creating a dot plot of the data without asking how one dot is created as a variation because these actions differed from those prescribed in the ALN and, by not giving students an opportunity to explain how the sampling distribution was constructed, were considered not as effective in enhancing students' conceptual understanding of a sampling distribution. Alternatively, if a teacher were to ask each student to place a sticky note with their sample mean on the whiteboard to construct the class's dot plot and then ask students to discuss "How is one dot created?" then those actions would be coded as an adaptation to this ALN; having students themselves, instead of their teacher, create the class's dot plot before discussing how one dot is created differs from the actions prescribed in the ALN and is as or more effective in enhancing students' conceptual understanding of a sampling distribution and, in particular, how it is created.
After we coded the teachers' instructional actions and reached a consensus, we summarized our results to visually compare the frequency of alignments, variations, and adaptations among the two teachers and across the different types of ALNs. Then, we selected rich examples from the teachers' implementations to illustrate how their instructional actions compared to what was prescribed in the ALNs. Together, this analysis allowed us to highlight the affordances and limitations of ALNs in statistics lesson plans.
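The frequency summary described above amounts to counting codes by teacher and ALN type. A minimal sketch of such a tally follows; the records are invented placeholders with generic teacher labels and do not reproduce any counts from this study, and the `tally` helper is hypothetical.

```python
from collections import Counter

# Hypothetical coded records: (teacher, ALN type, code).
# Placeholders for illustration only, not data from the study.
records = [
    ("Teacher A", "Technology Use", "alignment"),
    ("Teacher A", "Enhance Student Understanding", "variation"),
    ("Teacher B", "Enhance Student Understanding", "alignment"),
    ("Teacher B", "Recommendations & Reflections", "adaptation"),
    ("Teacher B", "Enhance Student Understanding", "alignment"),
]


def tally(coded_records):
    """Count how often each (teacher, ALN type, code) combination occurs."""
    return Counter(coded_records)


counts = tally(records)
for (teacher, aln_type, code), n in sorted(counts.items()):
    print(f"{teacher} | {aln_type} | {code}: {n}")
```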

Results
Through our analysis of both teachers' implementations, we found ways in which their instructional actions compared to those prescribed in the ALNs of differing types and how this differed between their classroom contexts. In the sections below, we provide evidence from their instruction to highlight these alignments, variations, and adaptations across the different types of ALNs.

Visual Comparison of Alignments, Variations, and Adaptations
Figure 3 summarizes how often the teachers' instructional actions aligned with, varied from, and adapted to actions prescribed in ALNs of different types. We note that Figure 3 only displays information about five of the seven types of ALNs because those types prescribed actions that we could directly observe during instruction; in contrast, the other two types of ALNs categorized as Sample Student Responses and Student Conceptions & Challenges did not prescribe observable actions, and neither teacher indicated they used them to prepare for the lessons. Broadly, Figure 3 illustrates how teachers' instructional actions compared within and across the different types of ALNs. We noticed a difference between ALNs categorized as Supplemental Questions and Technology Use relative to those categorized as Enhance Student Understanding, Recommendations & Reflections, and Statistical Focal Points. Robin's instructional actions tended to align with or adapt to those prescribed in these ALNs, and Karen's tended to vary from them. These observations are particularly notable given the contrasts between Robin's and Karen's classroom contexts. To better understand some of the affordances and limitations of these different types of ALNs across the differing classroom contexts, we qualitatively describe specific instances of alignments, variations, and adaptations from each teacher's instruction.

Alignments with Annotated Lesson Notes
Throughout both lessons, there were several instances where we coded a teacher's instructional actions as an alignment with an ALN. Robin's instructional actions typically aligned with inquiry-oriented actions prescribed in ALNs. For instance, in her class, the students were the primary users of the web applet, and she demonstrated its features to students only after they had first attempted to use them. Each student used their own iPad to conduct a simulation and compared their answers within their groups and, at times, with the whole class. Her students also completed the class activity handouts in their small groups, first discussing their ideas with one another before having a large group discussion. As her students worked in their groups, Robin circulated the classroom helping students as needed, questioning their responses, and checking their understanding; these kinds of instructional actions aligned with those prescribed in types of ALNs such as Recommendations & Reflections.
Both lesson plans contained an Enhance Student Understanding ALN that prescribed actions a teacher could take to help their students conceptually understand sampling distributions and the process underlying a simulation. It was evident that Robin's actions aligned with what this specific ALN prescribed; during each lesson, she continually emphasized to her students the importance of knowing what "one dot" on a sampling distribution meant and understanding how the web applet was conducting a simulation. She often stated to her students, "I need you to be very clear so that you know in your mind what each dot is going to mean when you hit 100 [more samples on the web applet]." To aid students in this understanding, she also used the class's data (from their by-hand simulation with index cards at the start of the lesson) to discuss how to build a sampling distribution, as recommended in the ALN. After creating the dot plot of the class's data (see Figure 4), Robin prompted her students to explain what one dot meant, guiding students to explain their understanding with statistical language, as seen in the conversation presented in Table 4. Those types of statistical conversations occurred throughout both lessons in Robin's class.
Table 4 (excerpt). "The percentage of people that listen to music would be that [motioning to the value of one dot displayed in Figure 4]."
In Karen's class, we observed an instructional action that aligned with an ALN classified as Technology Use. The ALN highlighted a hidden "click one dot" feature on the web applet that is not readily obvious to users but allows them to click on any dot in a sampling distribution and see information about that dot, such as the sample proportion or sample mean that the dot represents. During Lesson 1, Karen and her students did not make use of this feature, as prescribed solely in an ALN. However, in Lesson 2, when the directions on the class activity handout explicitly prompted students to "click on one dot in the sampling distribution," she and her students used this feature. Prescribing this action directly in the directions of the activity handout may have provided an accessible way for an instructor to do the actions prescribed in that ALN.

Variations from Annotated Lesson Notes
We found evidence of variations from ALNs in both Robin's and Karen's instruction, and we observed many of these variations in ALNs that appeared at the end of the lesson plans, regardless of the type of ALN. In those instances, we did not observe either teacher perform the actions prescribed in the ALNs, suggesting that they may have run out of allocated class time.
We coded other instructional actions during Karen's implementation as variations from Enhance Student Understanding ALNs. In one instance, Karen's students used the "click one dot" feature of the web applet during Lesson 2, as previously described, but they were not prompted to explain or demonstrate understanding of what one dot on a sampling distribution represented. After clicking one dot, students wrote what the sample proportion was and then moved on to the next question on the activity handout. This varied from an ALN categorized as Enhance Student Understanding, which provided actions a teacher could implement to help their students understand the process of a simulation. By not discussing what one dot represented in this instance, Karen's students missed that specific opportunity to dive deeper into understanding how the web applet was conducting a simulation and generating a sampling distribution. In another instance, we observed a shift in instructional actions that we coded as a variation. At the beginning of Lesson 1, Karen's students answered questions on the class activity handout in small groups before having a whole class discussion about the answers. Shortly after, Karen shifted from a small group approach to an instructor-centered approach, leading students through the activity questions and telling them the answers to write down. In particular, a question on the activity handout asked where the center of the distribution was located, and Karen told her students, "For #7, write center near zero." For this question on the activity, the ALN provided discussion prompts a teacher can use to encourage students to consider why this answer makes sense, what zero represents in the context of the problem, and how it relates to a lurking variable being "evened out" through random assignment. By treating the answer "zero" as a number without context, there was a missed opportunity to enhance students' understanding about some of the main advantages of random assignment.
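Why a center "near zero" makes sense can be sketched with a small simulation of random assignment. The Python code below is not the lesson's activity or applet; the lurking-variable values, group size, and number of shuffles are hypothetical, and the function names are ours. It mimics shuffling index cards into two groups and recording the difference in group means of a lurking variable.

```python
import random
import statistics


def shuffle_once(values, group_size, rng):
    """Shuffle the 'cards', split them into two groups, return the difference in means."""
    cards = values[:]          # copy so the original list is left unchanged
    rng.shuffle(cards)
    group_a, group_b = cards[:group_size], cards[group_size:]
    return statistics.mean(group_a) - statistics.mean(group_b)


def randomization_distribution(values, group_size, num_shuffles, seed=0):
    """Repeat the shuffle many times; each difference is one dot in the distribution."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    return [shuffle_once(values, group_size, rng) for _ in range(num_shuffles)]


# Hypothetical lurking-variable values for 12 subjects (illustration only).
lurking_values = [12, 15, 9, 20, 14, 11, 18, 16, 10, 13, 17, 19]
diffs = randomization_distribution(lurking_values, group_size=6, num_shuffles=1000)
print(round(statistics.mean(diffs), 2))  # close to zero on average
```

Individual shuffles can produce sizable differences between groups, but across many random assignments the differences balance out around zero, which is the sense in which a lurking variable is "evened out."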
During Lesson 1, we also observed Karen's instructional actions vary from those prescribed in a Recommendations & Reflections ALN. Karen's instructional actions, at this moment, focused on displaying a pre-made sampling distribution and instructing students to use the static figure to answer questions on the activity handout. Because Karen displayed a static figure, her students missed an opportunity to personally engage in the dynamic process of using the web applet to conduct a simulation, which can further aid in developing their conceptual understanding of sampling distributions.

Adaptations to Annotated Lesson Notes
During each lesson, we observed instructional actions from both teachers that we coded as adaptations to the ALNs. Generally, we observed Robin adapt the actions prescribed in ALNs in one noticeable way: her direct edits of the activity handouts' directions. In one instance, she inserted "checking points" into the activity handout, which prompted her students to stop and verify their answers with her or stop for a class discussion; these checking points corresponded to a few Statistical Focal Points ALNs and one Enhance Student Understanding ALN. Adding these checking points directly into the activity handouts' directions appeared to support the discourse between Robin and her students about how the statistical concepts they were learning in the lesson connected with statistical concepts from previous lessons and how those concepts would be important for understanding future statistical topics. Robin would remind her students that "everything we are doing in statistics is all connected, but sometimes the connection isn't obvious." In another instance, Robin revised the directions in Lesson 1's activity handout to require students to repeat the process of shuffling index cards by hand (at the beginning of the lesson) five times rather than one. We coded this instructional action as an adaptation to a Recommendations & Reflections ALN; this action aligned with her belief and research-based evidence (e.g., Hancock and Rummerfield 2020) that students better understand computer simulations when they are preceded by hands-on simulation.
During Lesson 2, we coded one of Karen's instructional actions as an adaptation to a Recommendations & Reflections ALN. In this instance, Karen stopped to check her students' understanding about why they would find an interval estimate instead of just a point estimate and how they can use simulation to estimate sample-to-sample variability. The activity handout contained a few paragraphs highlighting this information, but as stated in an ALN, students often skim or skip reading these paragraphs. By taking some time to reiterate the importance of this information and of fully reading and understanding these kinds of paragraphs inserted throughout the activity handout, Karen's instructional actions, at this moment, appeared to support her students in understanding why they were finding a confidence interval.
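The idea of using simulation to estimate sample-to-sample variability, and then reporting an interval rather than a single point estimate, can be sketched as follows. This Python code does not reproduce the method used in Lesson 2, which is not detailed here; it assumes a simple "estimate plus or minus two standard deviations of the simulated sample proportions" approach, and the observed proportion (0.6), sample size (50), and function names are hypothetical.

```python
import random
import statistics


def simulate_sample_proportions(p_hat, sample_size, num_samples, seed=0):
    """Simulate many samples from a population whose proportion equals p_hat."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    return [
        sum(rng.random() < p_hat for _ in range(sample_size)) / sample_size
        for _ in range(num_samples)
    ]


def interval_estimate(p_hat, sample_size, num_samples=1000, seed=0):
    """Assumed approach: point estimate +/- 2 SDs of the simulated sample proportions."""
    proportions = simulate_sample_proportions(p_hat, sample_size, num_samples, seed)
    spread = statistics.stdev(proportions)  # sample-to-sample variability
    return p_hat - 2 * spread, p_hat + 2 * spread


# Hypothetical observed proportion and sample size (illustration only).
low, high = interval_estimate(p_hat=0.6, sample_size=50)
print(round(low, 3), round(high, 3))
```

The point estimate alone says nothing about how much a sample proportion would change from one sample to the next; the simulated spread supplies that information and determines the width of the interval.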

Affordances and Limitations of Annotated Lesson Notes
By examining how the teachers' instructional actions compared to those prescribed in the different types of ALNs, we uncovered various affordances and limitations of ALNs for statistics instruction. Below, we discuss these affordances and limitations and provide suggestions for those who are developing statistics lesson plans with ALNs.

Knowledge Can Be Stored in Annotated Lesson Notes
The teachers' instructional actions that we coded as alignments or adaptations demonstrate that knowledge about teaching statistics can be stored in ALNs and transferred to instruction. In a post-lesson interview, Robin stated that she found the ALNs "very helpful" and described how she gained knowledge in how to convey statistical concepts, such as the meaning of one dot, during her instruction. This suggests that lesson plans with ALNs, such as those located on STEW (2022), can be beneficial to teachers. However, as evidenced by the instructional actions we coded as variations, how ALNs appear in the lesson plans also matters (see subsequent sections). One question for further research is whether it is beneficial to include knowledge about what not to do during instruction in ALNs. As written, our ALNs only suggest what a teacher can do; they do not directly suggest what a teacher ought not to do. Instead, ALNs could explicitly state both. For example, an ALN could include a statement such as, "Avoid using a static diagram of a pre-made sampling distribution because it limits students' opportunities to develop a conceptual understanding of how sampling distributions arise and what they represent. Engage students in creating a dynamic representation of a sampling distribution."

Include Annotated Lesson Notes Directly in the Class Activity Handout
In analyzing Karen's implementation of the lesson plans, we identified areas where the lesson plans with ALNs fell short of supporting her statistics instruction. Karen's intermediate algebra curriculum did not naturally leave much time to teach the two lessons; there was only one 50-minute class period, instead of two as recommended, to teach each lesson. She acknowledged before and after the study that time constraints imposed by her district's curriculum have limited what she can do in her classrooms, both regarding content and how she teaches it. Because of the constraints she faces, she values instruction that is delivered "efficiently." After teaching from the lesson plans, she stated how she liked "how students first conducted the randomization by hand [at the beginning of each lesson]," but that it was "pretty time-consuming." She expressed how the use of the web applet was "excellent" because it sped that process up. Because we observed Karen and her students engage with features of the web applet (like "click one dot") when directed to do so in the activity handout's directions, but not when those features were described solely in the ALNs, we suggest that any essential ALNs, such as those that support students' conceptual understanding, also be embedded directly in the activity handouts. For example, in an ALN, we strongly emphasized why it is important to ask students what one dot from their sampling distribution represented. We could revise our class activity handouts to not only include this "What is one dot?" question, but to also include questions such as "How does this computer simulation relate to the simulation you did by hand at the beginning of the activity?" and "Why do you think it is important to understand how one dot was created?"
Questions on the class activity handouts can also be created in ways that highlight essential information from multiple types of ALNs, such as those categorized as Student Conceptions & Challenges, Sample Student Responses, and Enhance Student Understanding: "Tonya thinks that one dot in the sampling distribution represents a single observation instead of the mean of 30 simulated values. Explain why Tonya might think that. Describe how you can help Tonya understand what one dot represents." This technique of embedding information contained within ALNs directly in the directions or questions on the class activity handouts may help teachers address these specific (and valuable) ALNs as intended.

Find a Balance Between Details and Prep Time When Writing Annotated Lesson Notes
One of the limitations of a lesson plan with ALNs is that it can be time-consuming for a teacher to read through, understand, and prepare for instruction. Robin indicated that she invested a lot of time preparing to teach the two lesson plans with ALNs. She described using a process that involved reading and questioning the ALNs, working through the class activity handouts herself, and thinking about how to best implement each lesson plan and how to incorporate the information contained in the ALNs into her own classroom setting, stating:
"[The lesson plans with ALNs were] not something that I felt like I could pick up, read through it, and say, 'okay I am ready to go.' I did have to read a couple of times to get things in my head. I really had to work the [class activity] problems, think about, you know, how would the kids interact with this and do all of those things. This is a process a teacher has to go through if you are going to do a good job anyways."
Robin thought about the content, engaged with the activity handouts as a learner, and interpreted the ALNs, integrating them with her own instructional methods. Her interactions with the ALNs, from the dual perspective of a learner and a teacher, meant that she invested considerable time into the process of understanding and implementing the lesson plans with ALNs. We do not know the extent to which Karen interacted with the ALNs because she stated that she read through the lesson plans but did not elaborate beyond that.
Just as statistical understanding ought to be nurtured throughout a student's education, knowledge for teaching statistics can be nurtured throughout many lesson plans with ALNs. Instead of incorporating many different types of ALNs in a single lesson plan, it may have been beneficial to limit the number and types of ALNs to focus on a few key concepts. For example, in Lesson 1, we could have added only ALNs that support the development of conceptual understanding (i.e., Enhance Student Understanding ALNs). Then, in Lesson 2, we could have added ALNs that focus on making connections between statistical concepts (i.e., Statistical Focal Points ALNs). By including fewer ALNs and varying the type(s) embedded in different lesson plans, ALNs could become more manageable for a teacher to use as intended.

The Evolution of a Statistics Lesson Plan with Annotated Lesson Notes
Groth (2015) states that it is "imperative that curriculum development not end with the production of written curriculum" (p. 14). In this study, both teachers took instructional actions that we coded as adaptations to those prescribed in the ALNs, which raises the question of how to incorporate these adaptations into the ALNs. Finding a way to capture and monitor these improvements and additions is central to building and storing a knowledge base for teaching statistics. Developing a dynamic system that enables teachers to update and revise ALNs is one possible approach worth further exploration. To our knowledge, there currently is not a dynamic system in place to capture teachers' and researchers' expertise in updating ALNs. The use of hyperlinks is one feature of a dynamic process that might be particularly useful for storing revisions of ALNs within a lesson plan. A hyperlink to a particular ALN could be used to view multiple revisions of that ALN. This could also allow teachers to read through specific ALNs and different revisions to find instructional actions that might work well given their specific classroom contexts and constraints. With the revisions stored behind a hyperlink, teachers can explore what has (or has not) worked well in the past and adjust accordingly.
We find our categorization of different types of ALNs (Table 2) to be useful for organizing and communicating the kinds of information different types of ALNs offer about statistics instruction. By using descriptive categories such as these, teachers and researchers can more easily convey the primary types of information each type of ALN provides. Such categorizations may also make it more feasible to support and sustain a dynamic system for updating ALNs within a lesson plan, because the expertise others would share to help update ALNs could be submitted and organized under the relevant category(ies). For example, a secondary mathematics teacher could submit an ALN on how to maximize class time by building connections between the statistics and mathematics content standards addressed in a lesson plan, categorized as Statistical Focal Points (or perhaps a new type of ALN categorized as Connections Between Mathematics & Statistics), as well as contribute Sample Student Responses that are not already listed. While challenging, creating an easily accessible and dynamic system that has some of these features for updating and revising ALNs would allow multiple members of the statistics and data science education community to continually contribute new knowledge, tips, and tweaks for statistics lesson plans.

Conclusion
The evolving nature of K-12 statistics and data science education presents an urgent need to support secondary mathematics teachers as they learn the complexities of teaching statistical concepts in potentially unfamiliar ways. For the statistics and data science education community to best support teachers, it is important to continue researching the development and use of lesson plans created for teachers. Our study suggests that lesson plans with ALNs can be useful for statistics instruction, but additional research is needed to fully understand how they can be improved to best support teachers across multiple classroom contexts and constraints. One constraint corresponds with the tension many secondary teachers face between having limited time to teach everything in their curriculum and using nationally recommended, but time-intensive, curricula and methods. Teachers and researchers should continue to examine different ways to design, write, revise, and update statistics lesson plans with ALNs that directly address teachers' differing classroom contexts and constraints to support K-12 statistics instruction.