How are undergraduate STEM instructors leveraging student thinking?

STEM instructors who leverage student thinking can positively influence student outcomes and build their own teaching expertise. Leveraging student thinking involves using the substance of student thinking to inform instruction. The ways in which instructors leverage student thinking in undergraduate STEM contexts, and what enables them to do so effectively, remain largely unexplored. We investigated how undergraduate STEM faculty leverage student thinking in their teaching, focusing on faculty who engage students in work during class. From analyzing interviews and video of a class lesson for eight undergraduate STEM instructors, we identified a group of instructors who exhibited high levels of leveraging student thinking (high-leveragers) and a group of instructors who exhibited low levels of leveraging student thinking (low-leveragers). High-leveragers behaved as if student thinking was central to their instruction. We saw this in how they accessed student thinking, worked to interpret it, and responded in the moment and after class. High-leveragers spent about twice as much class time getting access to detailed information about student thinking compared to low-leveragers. High-leveragers then altered instructional plans from lesson to lesson and during a lesson based on their interpretation of student thinking. Critically, high-leveragers also drew on much more extensive knowledge of student thinking, a component of pedagogical content knowledge, than did low-leveragers. High-leveragers used knowledge of student thinking to create access to more substantive student thinking, shape real-time interpretations, and inform how and when to respond. In contrast, low-leveragers accessed student thinking less frequently, interpreted student thinking superficially or not at all, and never discussed adjusting the content or problems for the following lesson.
This study revealed that not all undergraduate STEM instructors who actively engage students in work during class are also leveraging student thinking. In other words, not all student-centered instruction is student-thinking-centered instruction. We discuss possible explanations for why some STEM instructors are leveraging student thinking and others are not. In order to realize the benefits of student-centered instruction for undergraduates, we may need to support undergraduate STEM instructors in learning how to learn from their teaching experiences by leveraging student thinking.

Page 2 of 20 Gehrtz et al. International Journal of STEM Education (2022) 9:18
conceptual learning and reduce achievement gaps (e.g., Eddy & Hogan, 2014; Freeman et al., 2014; Laursen et al., 2014; Theobald et al., 2020). Although replacing didactic lecture with individual and group work engages students more actively during class, these strategies do not necessarily leverage student thinking, and therefore may not maximize student outcomes. We define leveraging student thinking as using the substance of student thinking to inform instruction. For example, an instructor leverages student thinking when they create opportunities for students to share their thinking about a topic, listen carefully to make sense of what a student is thinking, and then adjust their instruction to account for what they have learned about student thinking.
Leveraging student thinking may play an important role in achieving positive student outcomes in courses that actively engage students. Teachers who leverage student thinking support the development of students' conceptual understandings (Carpenter et al., 1989), promote more equitable participation (Empson, 2003; Warren et al., 2001), and create more positive learning experiences for students (Thornton, 2006). Furthermore, these teachers tend to experience professional growth as a result of regularly leveraging student thinking. In particular, listening in order to understand students' ideas and building on these ideas during instruction can support development of specialized teaching knowledge, which then contributes to improvement of teaching practice (Franke et al., 2001; Kim, 2019). Thus far, research has examined teachers in K12 contexts. The work presented in this paper investigates how undergraduate STEM instructors leverage student thinking in their teaching, and specifically examines the thinking and behaviors of instructors who have replaced some didactic lecture with active student engagement.
Several frameworks informed the conception, enactment, and reporting of the research presented in this manuscript, including teacher noticing, teacher responsiveness, and teaching knowledge frameworks. Teacher noticing, from mathematics education, and teacher responsiveness, from science education, share two key premises: (1) the heart of teaching is action in the midst of the complex social environment of the classroom, and (2) student thinking is productive and resourceful to teaching (Sherin, Jacobs, et al., 2011). Commonly, researchers studying teacher noticing distinguish between three interconnected skills: attending to student thinking, interpreting student thinking, and deciding how to respond to student thinking (Jacobs et al., 2010). van Es (2011) identified characteristics of expert teacher noticing. Namely, expert teachers tend to focus on the relationship between student thinking and teaching strategies, work to interpret student thinking, and reflect on specific instances of student thinking. In contrast, teachers with less developed noticing abilities focus more on the class environment and their general impressions, and spend more time evaluating the accuracy of student thinking than aiming to make sense of it (van Es, 2011).
Responsiveness, as a framework, focuses more narrowly on teachers' efforts to respond to student thinking. Robertson et al. (2016) emphasized that teachers who are responsive in their teaching tend to prioritize the substance of students' ideas and strive to help students connect their informal thinking to specific ideas in the discipline. Critically, responsive teachers rely on emergent student thinking to determine the direction of the activity or lesson. Responsiveness research has examined teacher discourse moves (e.g., teacher questioning) to characterize the extent to which teachers are taking up student ideas to inform the direction of the class lesson (Lineback, 2014; Pierson, 2008). As in teacher noticing work, this research suggests that teachers who demonstrate high levels of responsiveness elicit and inquire into student thinking and foreground student ideas, whereas teachers who demonstrate limited responsiveness tend to simply evaluate student thinking. Recently, research in undergraduate STEM has also turned attention to instructors' discourse moves (Kranzfelder et al., 2019, 2020). Early findings highlight that most STEM instructors are not doing a lot to elicit substantive student thinking (Alkhouri et al., 2021; Kranzfelder et al., 2020). Although this recent work provides insights into instructor behavior, it has not been designed to uncover the thinking behind instructors' decisions about leveraging student thinking.
Teaching knowledge frameworks also informed our research about leveraging student thinking. Pedagogical content knowledge, which is studied in both mathematics and science education, encompasses knowledge about student thinking and learning about specific topics, and knowledge about how instructional practices and representations facilitate this learning (e.g., Ball et al., 2008; Park & Oliver, 2008). Pedagogical content knowledge is topic-specific, so teachers need distinct knowledge for every topic that they teach (Chan & Hume, 2019). Teachers use pedagogical content knowledge when planning lessons and also in real time as they make decisions while teaching (e.g., Alonzo & Kim, 2016; Gess-Newsome, 2015). Teaching knowledge, like noticing and responsiveness, has primarily been examined in K12 contexts. However, a few studies have investigated teaching knowledge at the undergraduate level. Previous research has found that undergraduate instructors draw on their content knowledge and pedagogical content knowledge to identify, make sense of, and build on student contributions that are productive for achieving the learning goals (Andrews et al., 2019; Johnson & Larsen, 2012; Speer & Wagner, 2009; Wagner et al., 2007). Additionally, there is evidence that pedagogical knowledge, which is generalizable across topics, is important for accessing student thinking and monitoring student progress toward learning objectives (Andrews et al., 2019). The work presented in this manuscript builds on prior work grounded in teacher noticing, teacher responsiveness, and teaching knowledge frameworks. As described above, most of this work has investigated K12 educational contexts. Undergraduate teaching and learning differs in important ways and we cannot assume that discoveries transfer directly across educational contexts.
As just one example, undergraduate STEM instructors often have little or no formal education related to teaching and learning (e.g., Brownell & Tanner, 2012; Schussler et al., 2015), and may receive insufficient mentoring and feedback on their teaching as faculty (e.g., Brickman et al., 2016). However, undergraduate instructors have more years of training in the discipline and are often active scholars in the discipline. Therefore, undergraduate instructors may not deploy the same teaching knowledge and skills as K12 teachers, but may have more extensive and nuanced content knowledge. Given these differences, and others, findings established in K12 contexts must be investigated in undergraduate educational contexts. The work described in this manuscript investigates undergraduate instructors' behaviors as they leverage student thinking, and the thinking that underlies these behaviors.
We asked the following research question: how do undergraduate STEM instructors leverage student thinking in their teaching? We examined the thinking and practices of undergraduate instructors as they planned, enacted, and reflected on a lesson, with the goal of richly characterizing instances of leveraging student thinking.

Participants
We invited instructors from various STEM departments at the same research-intensive institution in the United States to participate. These instructors were recommended by colleagues as individuals who incorporated some active-learning strategies in their teaching, which we defined as a period of class when an instructor stops lecturing and students work alone or in groups. Participants were tenure-track faculty (N = 4), fixed-term faculty (N = 3), and a graduate student instructor of record (N = 1), and taught in the following STEM disciplines: Biology (N = 5), Physics (N = 1), Chemistry (N = 1), and Mathematics (N = 1). All but one participant had more than 5 years of experience teaching undergraduate STEM courses (and six participants had more than 10 years of experience). The graduate student instructor was leading the class for the first time after previously serving as a graduate teaching assistant in the same course. Three participants taught introductory STEM courses with 100-270 students, four participants taught courses with 45-75 students, and one taught a course with 19 students. These class sizes were typical for the introductory STEM courses at this institution.

Data collection: semi-structured interview and class observation
We interviewed participants before and after a class period, hereafter referred to as the target class, in order to elicit their thinking. We also filmed the target class to document instructional practices. We asked instructors to select a target class period that was typical of their instruction and included some time where students were working either individually or in small groups. The interviews focused on what would occur in the target class and what had occurred in the target class, which allowed us to hear the specific thinking of participants as they planned, enacted, and reflected on a class period, rather than more general or hypothetical thoughts about teaching. The following sections describe the specific goals of the interviews and provide details about the class observation and filming. All research was determined to be exempt by the Institutional Review Board at the University of Georgia (STUDY00006754).

Pre-instruction interview
The purpose of the pre-instruction interview was to identify participants' learning objectives for the target class, to elicit their knowledge of student thinking related to the focal topic(s) for the target class, and to gain insight into how this knowledge might inform their planning. Additionally, the pre-instruction interview provided an opportunity for participants to share specific instructional practices that they regularly used in their teaching (e.g., clicker questions) and to describe the rationale for these practices. We conducted this semi-structured 40-min interview 1 to 2 days before the target class. See the full interview protocol in Additional file 1.

Class observation and clip selection
We video-recorded the target class, filming from the back of the classroom to capture instructor behavior and student behavior. Each participant wore a lapel microphone that captured high-quality audio of the participant's voice as well as nearby student voices. J.G. used selection criteria to identify two to five video clips from the video-recorded target class for use in the post-instruction interview. Clips ranged in length from about 30 s to slightly under 3 min, with the total time of the selected clips encompassing about four and a half minutes per target class. J.G. selected clips that met at least one of the following criteria: the participant had access to information about student thinking through (a) interacting directly with students, (b) listening to students, or (c) looking at student responses to clicker questions. When narrowing down from the set of all clips that satisfied these criteria, J.G. prioritized clips where substantive student thinking was present. Specifically, she prioritized clips that included student contributions that were incorrect or incomplete, and clips that included questions posed by students that could not be answered by stating a fact or definition.

Post-instruction interview
In the post-instruction interview, we aimed to get a sense of what the instructor was thinking in real time during class as they interacted with students and made instructional decisions. We conducted the post-instruction interview within one to two days of the target class and before the subsequent class period in the course. The semi-structured interview consisted of two parts. The first part included the same questions across all participants and prompted discussion about the participants' perspectives on what happened in the target class period. The second part of the interview varied slightly across participants because it relied on video clips as stimuli. The interviewer and participant watched each video clip from the target class together and interview questions stimulated discussion about student thinking, instructor thinking, and instructor decision-making. For this portion we selected interview questions from a pre-established list that included questions like, "Can you say a little bit more about what you were thinking during this interaction?" and "Was gaining insight into this student's way of thinking useful/helpful for you? Why or why not?" Interviews were transcribed verbatim and checked for accuracy. See the full interview protocol in Additional file 1.

Data analysis
In order to characterize how undergraduate STEM instructors leveraged student thinking in their teaching, we drew on qualitative and quantitative analysis methods. In the following sections, we describe our data analysis process and the steps we took to ensure the trustworthiness of our approach.

Qualitative content analysis of interviews
Our first aim for the qualitative analysis was to identify and characterize instructor thinking and behaviors related to student thinking from the interviews. Our qualitative analysis process was collaborative and iterative. For the first phase of thematic analysis (Braun & Clarke, 2006), we collaboratively coded interview transcripts to generate an initial codebook. We developed codes by identifying relevant segments of the transcripts, naming the code using a word or phrase that reflected the main idea in the segment, and by creating a definition that elucidated what the code was capturing. For example, a code named "designs/selects problems or tasks" captured instances when a participant described responding to information about student thinking by designing or selecting additional problems or tasks for students to work through in class or in subsequent instruction. Coded segments ranged from short sentences to entire talk turns, and we coded segments with multiple codes as necessary to capture the ideas expressed. At least two researchers coded each transcript; we met regularly to discuss coding decisions, to come to consensus about what and how segments were coded, and to create and refine codes as necessary. This is one form of constant comparison (Birks & Mills, 2011; Charmaz, 2006). Coding to consensus allowed the research team to have greater consistency in coding and allowed for a more nuanced understanding of the codes, which would not have been possible if we had prioritized inter-rater reliability. At that point, we reviewed all coded segments multiple times, providing another opportunity for constant comparison. This allowed us to separate codes when multiple ideas were captured by the same code, combine related codes, revise definitions to establish a clear delineation between codes, ensure all segments fit within the codes, and to recode segments as necessary.
The second phase of our qualitative analysis process involved identifying and grouping related codes into themes, sometimes referred to as axial coding (Charmaz, 2006; Saldaña, 2013). Themes emerged from repeated discussions among the research team. We frequently drew diagrams individually and collaboratively to understand the relationship between ideas. Comparing and synthesizing these representations revealed similarities and differences in our thinking and facilitated refinement of emergent themes. Using qualitative analysis software (MAXQDA), we also examined segments where codes co-occurred in order to better understand how codes were related and to examine relationships among themes. We presented themes and examples of coded segments to colleagues for feedback, which led to further refinement. After the research team reached consensus on descriptions for each code and theme, and on the fit of each coded segment within the appropriate codes and themes, J.G. revisited all transcripts to ensure every relevant segment was coded using the finalized codebook, bringing segments to the research team for discussion as necessary.
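The idea of examining segments where codes co-occur can be illustrated with a short sketch. This is a minimal Python illustration and not the MAXQDA procedure itself: the function name `code_cooccurrence` is hypothetical, the code labels other than "designs/selects problems or tasks" are hypothetical, and the segments are invented for illustration rather than drawn from the study's data.

```python
from collections import Counter
from itertools import combinations


def code_cooccurrence(coded_segments):
    """Count how often each pair of codes is applied to the same segment.

    coded_segments: iterable of collections of code names, one per segment.
    Returns a Counter keyed by alphabetically sorted (code_a, code_b) pairs.
    """
    counts = Counter()
    for codes in coded_segments:
        # A segment may carry several codes; count each unordered pair once.
        for pair in combinations(sorted(set(codes)), 2):
            counts[pair] += 1
    return counts


# Invented example segments (hypothetical, not data from the study).
segments = [
    {"designs/selects problems or tasks", "interprets student thinking"},
    {"interprets student thinking", "knowledge of student thinking"},
    {"designs/selects problems or tasks", "interprets student thinking",
     "knowledge of student thinking"},
    {"interprets student thinking"},
]

cooccurrence = code_cooccurrence(segments)
for (code_a, code_b), n in cooccurrence.most_common():
    print(f"{code_a} + {code_b}: {n}")
```

Pairs that co-occur often are candidates for closer reading of the underlying segments; the counts alone do not establish how the codes are related.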

Video analysis of access to student thinking in class
We conducted a systematic analysis of video-recordings of the target class to document how frequently participants had access to student thinking during class. No existing classroom observation protocols provided a fine-grained analysis of access to student thinking, so we developed a simple coding scheme aligned with our research goals. We identified all instances when the instructor had access to detailed information about student thinking, which we refer to as "high-resolution information". We coded any instances when (a) an instructor listened to students voicing their thoughts during a whole class discussion; (b) an instructor listened to a student sharing their thinking during a one-on-one or small-group interaction; or (c) an instructor had the opportunity to see or hear student thinking while students were working (e.g., eavesdropping). Interactions that included the instructor talking were split up so that each coded segment had no more than 10 consecutive seconds of instructor talk. Table 1 includes detailed descriptions of these codes. We compared these data with participants' coded interviews. Although the interviews revealed other approaches instructors used to access information about student thinking, such as attending to students' facial expressions, these instances could not be as reliably documented by an observer and thus were not part of the video-data analysis. Our goal for this analysis was to estimate how much class time provided instructors with access to information about student thinking. Researchers can code videos in MAXQDA by selecting an exact start and end time for each applied code. This produces specific estimates of the amount of class time dedicated to each coded activity, rather than the coarser time estimates produced by tools like the Classroom Observation Protocol for Undergraduate STEM (COPUS; Smith et al., 2013) that use 2-min time segments.
The limited size of our dataset made this more fine-grained analysis of instructional behaviors possible. One researcher (J.G.) coded all target class video-recordings. We calculated the percentage of class time in each activity (i.e., whole class discussion, small-group interaction, eavesdropping) and a total percentage of time with access to high-resolution information (all codes together). Calculating percentages allowed us to compare results across participants who taught 50- and 75-min class sessions. The number of participants is too small to meaningfully make statistical comparisons, so we report descriptive statistics (i.e., means and standard deviations), as well as a visual representation of all of these data.
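The percentage calculation described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the authors' analysis script: the function name `percent_class_time` is hypothetical, the segment times are invented, and coded segments are assumed not to overlap in time.

```python
from statistics import mean, stdev


def percent_class_time(segments, class_seconds):
    """Convert coded video segments into percentages of class time.

    segments: list of (code, start_s, end_s) tuples with exact start and
        end times in seconds; assumed (hypothetically) not to overlap.
    class_seconds: total length of the class session in seconds.
    Returns a dict of percent per code plus an "all codes" total.
    """
    totals = {}
    for code, start, end in segments:
        totals[code] = totals.get(code, 0.0) + (end - start)
    percents = {code: 100.0 * secs / class_seconds
                for code, secs in totals.items()}
    percents["all codes"] = 100.0 * sum(totals.values()) / class_seconds
    return percents


# Invented segments for two hypothetical classes (not data from the study).
class_a = [  # a 50-min session
    ("whole class discussion", 60, 150),
    ("small group interaction", 600, 900),
    ("eavesdropping", 900, 960),
]
class_b = [  # a 75-min session
    ("whole class discussion", 0, 450),
    ("small group interaction", 1000, 1900),
]

pct_a = percent_class_time(class_a, 50 * 60)
pct_b = percent_class_time(class_b, 75 * 60)

# Percentages put sessions of different lengths on a common scale.
overall = [pct_a["all codes"], pct_b["all codes"]]
print(f"mean = {mean(overall):.1f}%, SD = {stdev(overall):.1f}%")
```

Working in percentages rather than raw seconds is what makes the 50-min and 75-min sessions comparable; `stdev` here is the sample standard deviation, which is one reasonable choice for a small descriptive summary.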

Table 1 Codes used to tag video recording of target class
Code | Description of code
Whole class discussion | Instructor listened to student thinking during a whole class discussion, which is when all students could hear the interactions between students and the instructor. This code included conversations between one student and the instructor that could be heard by the class
Small group interaction | Instructor listened to student thinking within a small group (consisting of one or more students) during a direct interaction with the group. This code included both brief interactions when the instructor was checking on student thinking and more extended conversations. This code excludes conversations between a student and instructor that could be heard by the whole class
Eavesdropping | Instructors were within 6 feet of students, were not engaged in another activity, and appeared to be listening to students or looking at student written work without any direct interaction

Contrasting high- and low-leveragers
Systematically analyzing interviews and class video revealed variation among participants in the extent to which they leveraged student thinking during the target class period. Since the goal of this work was to characterize how instructors leveraged student thinking, contrasts between higher and lower levels of leveraging were informative. We examined our corpus of data about participants' teaching practices and thinking to determine whether and how participants could be grouped based on evidence of leveraging student thinking. We examined counts of coded segments, full transcripts, and percentage of class time spent in particular activities. Multiple ways of considering these data suggested that there were two clear groups of participants and some participants who did not fit neatly with either group. For example, some interviews received a much greater diversity of codes that captured different ways that participants accessed student thinking, whereas other interviews had fewer codes about accessing. Looking even more closely, we noted that those who had fewer approaches to accessing student thinking also tended to rely on approaches that provided less detailed information of student thinking. These differences between groups and others are the main findings of this study, and therefore we reserve additional detail for the results section. Based on our rich examination of each participant's thinking and behaviors, we concluded that three participants demonstrated similar thinking and behaviors that indicated high levels of leveraging student thinking during the target class, and two demonstrated low levels of leveraging student thinking in the target class. Hereafter, for concision, we refer to these groups as high-leveragers and low-leveragers, respectively. "High-leverager" should be interpreted as referring to a participant who engaged in high levels of leveraging student thinking in the target class period and for whom there is evidence that suggests they have the skills and knowledge to repeatedly leverage student thinking.
In contrast, "low-leverager" should be interpreted as referring to a participant who engaged in low levels of leveraging student thinking during the target class period, and for whom we lack evidence that suggests that they have the skills and knowledge to engage in high leveraging repeatedly. This grouping is based on data from one class period and these instructors may behave very differently on other days of class. However, these groups acted and thought in distinct ways and shared key within-group similarities during the target classes. The remaining three participants who demonstrated intermediate levels of leveraging student thinking during the target class period varied too much from each other to be meaningfully grouped together, yet also were not similar enough to high- and low-leveragers to be grouped as such.
Considering participants within these groups lent further thematic organization to our axial codes, resulting in four overarching thematic differences between high-leveragers and low-leveragers during the target class period, which we present as the major results below. As a final stage of constant comparison, we returned to the full transcripts of each high- and low-leverager to ensure that the four overarching differences accurately, fairly, and thoroughly characterized how they did or did not leverage student thinking during the target class period. The majority of the results section contrasts the thinking and practices demonstrated by high- and low-leveragers during the target class period because these findings are most robust in our data, but we also draw on the case of one participant who showed intermediate levels of leveraging student thinking. We include these data because they further elaborate one of the overarching themes and help to address our research question. See Table 2 for participant details.

Trustworthiness of our qualitative approach
There are several attributes of our data collection and analysis that contribute to the trustworthiness of our approach. The primary goal of this study was to better understand how undergraduate STEM instructors leverage student thinking in their instruction. The study design aligns with our research goals, creating a foundation for credibility in our work (Shenton, 2004). Specifically, we grounded the data collection around one target class lesson, which allowed instructors to focus on details of their practice for a particular lesson rather than relying on general or hypothetical situations. Additionally, the video clips of the target class used in the post-instruction interview positioned instructors to re-capture their real-time thinking (McAlpine et al., 2006; Sherin, Russ, et al., 2011). Further, we designed the interview questions to elicit instructor knowledge of student thinking and their rationale for instructional decisions, prompting for connections to student thinking (Ball, 1988; Ball et al., 2008). We regularly asked follow-up questions, which provided opportunities for participants to explain what they meant and kept our assumptions and interpretations about their instructional approach to a minimum. Qualitative analysis of both the class videos and interview transcripts allowed us to triangulate data sources, shedding light on consistency between practice and the instructors' discussion of their practice.
Our research process included multiple opportunities for independent and collective reflection and sense-making, which increases credibility (Anfara et al., 2002; Shenton, 2004). We used a constant comparison approach to qualitative data analysis with multiple researchers coming to consensus about coding decisions (Birks & Mills, 2011). This ensures that coding and findings did not emerge from one individual's interpretation. Additionally, reading through and reanalyzing coded segments promoted a code system and findings that were stable over time. Throughout our analysis, we regularly wrote analytic memos documenting our discussions and rationale for making decisions. We have also described our process in detail within this manuscript, which allows the reader to consider how we arrived at our findings and contributes to the confirmability of our work (Mays & Pope, 2000; Shenton, 2004). The goal of this work is to contribute to a deep understanding of what instructors are doing to leverage student thinking in their instruction. It is not our goal to make generalizations or apply our findings to other contexts. We used these data to generate hypotheses that could be investigated further. We also believe that our findings have inferential generalizability for instructors who would like to make student thinking more central to their own teaching (Lewis et al., 2014).
Finally, we describe our positionality to this research to highlight our reflexivity in this work (Mays & Pope, 2000). J.G. and T.C.A are STEM instructors who view accessing and responding to student thinking as central components of our instruction. We drew on our own teaching experiences to interpret data and to provide context for various instructional decisions, but were careful to not make assumptions in the data analysis. Additionally, we were able to draw on the perspective of an undergraduate research assistant (M.B.) who had been a student in two of the participants' classes and served as a peer learning assistant for one participant. M.B. was also studying to be a secondary school teacher. Therefore, she provided a student's perspective, a pre-service teacher's perspective, and insights into the classes she had experienced as we considered the data from the interviews and video-recording of the target class. Furthermore, we sought feedback from multiple colleagues who were able to share their perspectives as STEM education researchers and undergraduate STEM educators. All researchers identify as straight, able-bodied, neurotypical, White, cisgendered females, which provides a privileged and inherently limited perspective.

Results
Faculty who leveraged student thinking frequently and in a variety of ways (i.e., high-leveragers) behaved as if student thinking was central to their instruction during the target class. We saw this in how they accessed information about student thinking from multiple students, in how they worked to interpret information about student thinking, and in how they used student thinking to inform in-the-moment and future instructional decisions. We also observed that high-leveragers drew on knowledge of student thinking (i.e., a component of pedagogical content knowledge; Park & Oliver, 2008) to inform how they leveraged student thinking. We illustrate these findings by describing high-leveragers' thinking and practice as they leveraged student thinking and by contrasting this with that of the low-leveragers. We draw on both qualitative and quantitative data, and present quotes that are representative of themes, editing quotes lightly for clarity. Indented sections and sections within quotation marks are quotations from participants.
We first describe how participants accessed student thinking, and then how they interpreted and responded to student thinking. These steps were often cyclical for high-leveragers, as illustrated in Fig. 1, with responses to student thinking providing additional opportunities to access student thinking. Furthermore, although leveraging student thinking involved each of these stages, responding to student thinking could be considered most critical. Responding is the necessary action that takes information gained about student thinking and uses it to make instructional decisions that could actually impact students and their learning. Responding, as we mean it, results from a decision that is only possible to make in light of making sense of information about student thinking. Critically, instructors' knowledge of student thinking informed how they accessed, interpreted, and were able to respond to student thinking, which is depicted in Fig. 1 by the triangle at the center. This knowledge included awareness of common difficulties and misconceptions students encounter when learning a specific topic. We report how this pedagogical content knowledge underpinned and bolstered stages of leveraging student thinking at the end of the Results. We use pseudonyms for all participants, with high- and low-leveragers' names starting with H and L, respectively.

Access: high-leveragers elicited student thinking more often and in more ways than low-leveragers
Class observations showed that high-leveragers frequently accessed student thinking during the target class, and interviews indicated that they did so intentionally. This section first illustrates the diversity of ways that high-leveragers accessed information about student thinking by synthesizing across high-leveragers' target classes. We next present quantitative evidence about the differences in how high- and low-leveragers accessed student thinking. We end the section with in-depth qualitative descriptions of how high- and low-leveragers accessed student thinking.
High-leveragers accessed student thinking using a variety of approaches throughout the target class period. These instructors frequently posed questions to the class in the form of clicker questions, spoken questions, and/or questions on worksheets. High-leveragers often gave students time to work on the questions in small groups.
Page 8 of 20 Gehrtz et al. International Journal of STEM Education (2022) 9:18
While students worked, high-leveragers circulated the classroom looking at students' written work, attending to students' facial expressions, eavesdropping on students' conversations, and regularly stopping to discuss content with small groups. During small-group interactions, students asked questions or the instructor prompted students to share their thinking. Students explained how they were reasoning through the problem, where they were struggling, and areas of uncertainty. After high-leveragers talked with a few groups, they returned to the front of the room. If they had presented a clicker question, they reviewed the results. As a next step, high-leveragers frequently initiated a whole class discussion in which one or more students shared their thinking. This synthesized class sequence shows that high-leveragers elicited both high- and low-resolution information about student thinking during class. High-resolution information reveals details about an individual student's or a group's thinking, potentially including their reasoning, problem-solving approaches, and areas of difficulty. For example, eavesdropping on students' conversations in small groups can reveal to an instructor the nature of the difficulties students encounter as they answer a question. Low-resolution information, on the other hand, lacks detail about individual students' thinking. For example, clicker questions provide low-resolution information, as does listening to the volume of chatter in the room or watching facial expressions. Low-resolution information can tell an instructor what is occurring (i.e., students have confused faces), but does not provide details about why. Importantly, low-resolution information is not necessarily less useful to instructors than high-resolution information.
An instructor might strategically design multiple choice questions so that answer choices align with common non-standard student responses. The data an instructor receives from how students answer such a question is low-resolution information about what student thinking is most prevalent in the class and could inform how the instructor decides to move forward.
Systematic analysis of the videos of target class sessions focused on opportunities that participants created to access high-resolution information, and revealed that accessing high-resolution information tended to be more common among high-leveragers than low-leveragers (Fig. 2). We documented the percentage of class time that participants spent engaged in whole class discussions, small-group interactions, and eavesdropping, as well as the total percentage of time spent in these activities. Given our small sample size, we did not make statistical comparisons and instead have provided a graph and descriptive statistics. High-leveragers, on average, spent 37% (SD = 14%) of the target class engaged in small-group interactions with students, eavesdropping, or in whole class discussions, each of which can provide access to high-resolution information about student thinking (Fig. 2). In contrast, low-leveragers spent, on average, only 14% (SD = 13%) of the target class engaging in these activities. This difference primarily results from differences in class time spent interacting with small groups about course content. High-leveragers spent considerably more class time (mean = 19%, SD = 5%) interacting with small groups compared to low-leveragers (mean < 1%, SD = 0.7%). This finding is echoed in how high-leveragers described their own teaching. For example, Halle stated, "During class I talk to my students for pretty much the whole time, pretty much every class period." Importantly, low-leveragers asked students to work in groups during the target class sessions and students did so. Thus, the key difference is the extent to which the instructors took advantage of the chance to interact with small groups and hear their thinking during small group work time. High-leveragers did not differ systematically from low-leveragers in the time they spent in whole-class discussions (high-leveragers mean = 10%; low-leveragers mean = 9%) or eavesdropping (high-leveragers mean = 8%; low-leveragers mean = 5%).
Fig. 1 … Purple text denotes features that distinguished high-leveragers from low-leveragers in accessing, interpreting, and responding to student thinking, as well as the knowledge supporting these actions. Black text denotes behaviors of both high- and low-leveragers. Larger font size in "Access" denotes the most common approaches used by high-leveragers
High-leveragers described how they valued and sought high-resolution information about student thinking. When reflecting during the interview Hen said: "I'll look at what they've written, and I might ask them to explain what they've written and then tell me about what you're doing." This reveals that she intentionally looked at the content of student written work and sought additional insight when she asked students to explain. Helge frequently eavesdropped on student conversations, explaining in the pre-instruction interview,
I'm kind of an eavesdropper, and I walk around and listen. I will listen to what they're saying to each other. If I need to intervene, I will, because sometimes they're just so off base that they're never going to figure it out, and I don't want them to keep going down that path. But a lot of times I can just stop and just ask them a pointed question and they'll kind of look at me for a minute and then start talking to each other.
This demonstrates that Helge accessed valuable information about student thinking while circulating the classroom. In addition to learning about what students were thinking, Helge used this as an opportunity to diagnose where students were having trouble and to ask a question specifically to help students continue to make progress.
Fig. 2 Opportunities to access high-resolution information about student thinking created by high- and low-leveragers. Observed percentage of time in target class that the instructor had the opportunity to encounter high-resolution information about student thinking. Dots on the left for each category represent high-leveragers' observed behaviors and triangles on the right represent low-leveragers' observed behaviors
High-leveragers also capitalized on low-resolution information about student thinking during the target class. In another pre-instruction interview excerpt, Helge described how she uses students' facial expressions as an indicator that students are confused.
It's really easy [to tell by their faces if they're getting it]. You're going to get 200 people looking at you with this blank look on their face. Or … their talking back and forth gets louder because they're not getting it, and they're not actually working the problem. They're talking to each other, trying to figure it out, and they're just not getting it. … And you get to learn to read their expressions. I rely a lot on that and just looking at them and seeing, you know, are they getting it or not?
Although low-resolution information did not allow for access to the details of individual students' thinking, it was useful for high-leveragers because they could use it to draw conclusions relatively quickly about the class as a whole.
In comparison to high-leveragers, low-leveragers used fewer approaches to access student thinking and did so less frequently during the target class. Low-leveragers primarily accessed student thinking through clicker questions, students volunteering explanations, and occasionally through students' facial expressions. The two low-leveragers differed in the amount of access they had to high-resolution information, with Lou spending more than twice as much class time as Les with the opportunity to access high-resolution information in the target class (Fig. 2). Lou asked students to discuss data at tables, walked around the room as the tables discussed, and then asked each table to explain an idea from their discussion, which resulted in class time coded as both whole class discussion and eavesdropping. Lou did not talk to the working groups, nor did the post-instruction interview reveal any intentions to gather information about student thinking for the purposes of making instructional decisions during this period of the target class.
Low-leveragers recognized that insufficient access to student thinking limited their ability to respond to student thinking from class to class and semester to semester. For example, as Lou discussed in the pre-instruction interview how he decided to make changes from year to year, he reflected that he did not know what students found particularly difficult because "Nothing in my notes really addressed major difficulties that they had. … Unfortunately, I wouldn't necessarily know. They don't turn in anything." Lou asked students clicker questions during class, but described the clicker questions as a check on whether students were paying attention. He explained that "The [clicker questions] … they're really designed to be participation points, so they're not that hard." Therefore, the questions that Lou asked limited his access to useful information about student thinking. Les expressed a similar sentiment and also explained that clicker questions were the only way to get information about student thinking in a large class. Les's class had an enrollment of 72 students. However, high-leveragers did not limit themselves to using clicker questions to access information about student thinking in their large classes. Helge's and Hen's classes had 269 and 100 students, respectively (Table 2).
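The class-time comparison reported in this section (Fig. 2) rests on simple arithmetic: the share of a class period coded as small-group interaction, eavesdropping, or whole class discussion, followed by a per-group mean and standard deviation. The following is a minimal sketch of that arithmetic in Python, not the analysis procedure used in this study. The code labels, the segment format, the function names, and all numbers are hypothetical and chosen only for illustration. The sketch also assumes that coded segments do not overlap and that the sample standard deviation is the statistic of interest.

```python
from statistics import mean, stdev

# Hypothetical code labels for activities that can give an instructor
# access to high-resolution information about student thinking.
HIGH_RES_CODES = ("small_group", "eavesdropping", "whole_class")


def percent_time_by_code(segments, class_seconds):
    """Return {code: percent of class time} for one coded class video.

    segments: list of (start_sec, end_sec, code) tuples, assumed
    non-overlapping. Codes outside HIGH_RES_CODES are ignored.
    """
    totals = {code: 0.0 for code in HIGH_RES_CODES}
    for start, end, code in segments:
        if code in totals:
            totals[code] += end - start
    percents = {c: 100.0 * secs / class_seconds for c, secs in totals.items()}
    percents["total"] = sum(percents.values())
    return percents


def describe(values):
    """Mean and sample standard deviation (descriptive statistics only)."""
    return mean(values), stdev(values)


# Made-up coding of a hypothetical 50-minute (3000 s) class.
example_segments = [
    (0, 300, "lecture"),
    (300, 600, "small_group"),
    (600, 750, "eavesdropping"),
    (750, 1050, "whole_class"),
    (1050, 3000, "lecture"),
]
example = percent_time_by_code(example_segments, class_seconds=3000)
print(example)

# Made-up per-instructor totals (percent of class time), not study data.
group_totals = {"high": [20.0, 30.0, 40.0], "low": [10.0, 20.0]}
for group, totals in group_totals.items():
    m, sd = describe(totals)
    print(f"{group}-leveragers: mean = {m:.1f}%, SD = {sd:.1f}%")
```

With so few instructors per group, a summary of this kind is descriptive only, which is consistent with the decision above not to make statistical comparisons.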

Interpret: high-leveragers tried to make sense of student thinking more often than low-leveragers
High-leveragers not only accessed, but also interpreted information about student thinking in real time, with the goal of using that information to make instructional decisions. They listened to what students were saying, worked to make sense of student thinking (which was often incomplete or incorrect), and then took action to respond based on the conclusions they drew. Halle expressed this as "just try[ing] to really… figure out what exactly they're asking and what they need." Importantly, high-leveragers could often describe the reasoning they did about student thinking. Other times, we inferred that they had interpreted student thinking when they accessed student thinking and then made a teaching move directly related to the student thinking they had just encountered.
In small-group interactions during the target class, high-leveragers frequently asked students questions as they tried to interpret student thinking. In the following quote from the post-instruction interview, Hen systematically considered what a student might be thinking before deciding what actions she, as the instructor, needed to take to move the student's thinking forward. Hen said:
I think she thought she was looking at quaternary structure, which is two different things coming together, and we were just looking at one thing. So, the fact that she was saying it looks like quaternary, I was like, 'Okay, but then what do you think quaternary means?' … If she had recognized that this was one thing and that that was the same thing, she would never have said quaternary.
Although the interaction between Hen and this student was brief, with the student only saying three short sentences, Hen was able to reason through the student's thinking. Hen concluded that the student did not understand the model that she was looking at or the distinction between different levels of protein structures. When another student chimed in and also could not identify when a model depicted quaternary structure, Hen responded by going over the defining features of different levels of protein structure again with the whole class because the student thinking she accessed and interpreted indicated that students lacked key understanding that they needed to achieve the learning objectives for the lesson.
Even more commonly than making sense of an individual student's thinking, high-leveragers made sense of student thinking by drawing on information from multiple students. For example, Halle described making sense of high-resolution information about student progress on the problem from what she was hearing from multiple student conversations. She was able to conclude that the students needed more time to engage with the problem before moving on and discussing it as a whole class. In the post-instruction interview, Halle said:
I decided not to move the discussion forward as quickly as I had planned because based on the way that they were talking … I could tell that they were just starting to really put the ideas together towards the end, and I just made a decision to just let them swim in that problem for a long time instead of trying to get through content.
Further, high-leveragers used low-resolution information about student thinking to draw conclusions about where students were in their thinking and to gauge class progress toward learning objectives. For example, during the target class Helge asked students to respond to a clicker question that required a numerical answer. In the interview after class, Helge said she had calculated what the answers would be if students made common mistakes. This allowed her to identify that students who answered in a particular way had used an approach that was common, yet incorrect. She stated in the post-instruction interview:

I calculated the wrong answer beforehand. … When the results come in for the question, I look at how many people answered it wrong and there's more than one wrong way to do it. And I look at the most common wrong ways and address it.
This quote demonstrates that Helge intentionally attended to low-resolution information, could use it to gauge where the class was in their thinking, and then drew conclusions about what she needed to do next in the lesson.
In contrast, low-leveragers did not focus on making sense of student thinking as high-leveragers did. Instead, they tended to have other goals for accessing student thinking. Most commonly, low-leveragers wanted to check that students were engaged and paying attention during class. For example, in the target class period, Lou had all tables share out answers. He said, "By asking every table to report out … I think there's a little bit of a motivation to actually think about the question. And then … [they] are forced to participate." Although here Lou had some access to student thinking, his goals for accessing did not focus on learning about student thinking to inform instruction. Other times low-leveragers seemed to skip interpreting student thinking altogether, or they only interpreted student thinking in a cursory or superficial way. For example, during the target class when Les heard two sentences of a student sharing their thinking, he quickly concluded the student's thinking was "tangential" and incorrect, and then moved on to explain the connection that he wanted students to make. Notably, Les had to consider the student's thinking in order to evaluate it and respond, but there is no evidence that he worked to make sense of the student's thinking beyond evaluating its accuracy. In the example from Hen above, she had similar access to student thinking (three short sentences), yet she reasoned through the student's thinking before identifying her next steps. Of course, any instructor may encounter student thinking that is not productive to pursue to further the learning goals. However, low-leveragers appeared to rarely make sense of and use student thinking they encountered in the target class, suggesting that either the questions they posed did not elicit productive student thinking or they could not recognize and take advantage of productive, if ill-formed, student thinking.

Respond: high-leveragers used student thinking to inform instruction more immediately and more often than low-leveragers
One of the most distinguishing features of high-leveragers was that they altered their instructional plans from lesson to lesson based on what they learned about student thinking in class. Generally, they responded by designing or selecting problems that were not originally included in their instructional plan. These problems were designed to target specific content that was proving difficult for students. For example, Helge stated the following in the pre-instruction interview when describing changes she was making to the target class based on student thinking she observed in the previous class period:
They struggled as usual on the stoichiometry problems. So I've added some more just to make them …
Helge observed that her students "struggled as usual" on the stoichiometry problems, indicating that she recognized that this was an area where students commonly have difficulty. Consequently, before the target class, Helge added problems to the lesson and the weekly quiz. After the target class, Helge emphasized that she was prepared to adjust the practice problems she provides in any given class period based on what each section that she teaches needs. She said:

When I teach multiple sections, it's never the same. … I might use different questions in each section, depending on the class. I usually have multiple questions asking the same thing and making them go about it in a similar way. And if they get it as a class the first time, then I skip those and go on to something else. There's no need to keep redoing it if they get it. But if I have a class that's maybe struggling a little more. Then we'll ask those other questions.
High-leveragers behaved as though their timeline for achieving instructional objectives was somewhat fluid. Their willingness to alter instructional plans from lesson to lesson suggests that they prioritized student mastery and valued student thinking, using it to inform their instructional decisions. For example, Hen described regularly modifying problems and revisiting material in the next class that she had already covered because some students had not yet achieved the learning objectives. In the post-instruction interview she said,
If they don't understand it after instruction or after …
Hen noted that many of the adjustments she makes from lesson to lesson are based on common issues that arise for students. Although Hen tried to anticipate some of these common issues, she acknowledged that every group of students is different. Consequently, she cannot always anticipate how long it will take to achieve the learning objectives.
In addition to adjusting examples and content covered in the following class period, high-leveragers frequently responded to student thinking in-the-moment while teaching during the target class. They responded by addressing common student questions with the whole class, adjusting the pacing in response to student progress, and facilitating small-group discussion when students were stuck on a problem. In the following excerpt, Hen describes answering a specific student's question for the whole class because she expected other students to experience similar difficulties. Hen said,
If one student has a question, generally multiple students have that question. And I wanted to make sure that that question was answered before we moved on because I felt like it was pretty fundamental to understand that to be able to do the next thing. So if he's asking it, and he's pretty bold, … that means that the quiet students are having that question too.
High-leveragers regularly responded to small groups by offering something to prompt or support student reasoning. They might provide a guiding question, a prompt, or a resource that they believed would help students with a specific difficulty. In the following excerpt from the post-instruction interview, Halle highlighted that she tries to give students just enough information to help them continue working, and that her aim is to be responsive to student thinking as she interacts with groups during real-time teaching. Halle said,
All of the feedback that I gave them for the whole class was based off of trying to make sense of the information that I was getting from them in real time. … I walked into class with a plan, but then what I actually did really was dependent on what I saw and heard from each group as they were working on stuff. … Even with 19 students, trying to respond thoughtfully in a way that doesn't give too much information away, in a way that helps them approach a solution, and in a way that helps them feel like I am supporting them and not just trying to confuse them more, it's really hard and really complicated. It's hard to understand their questions and it's hard to keep ownership of the problem in their hands while also helping them.
Compared to high-leveragers, low-leveragers responded to information about student thinking less frequently and in limited ways during the target class. Namely, low-leveragers tended to respond to information about student thinking by providing explanations about the content, without using or building on student ideas. When low-leveragers recognized that students did not understand material, they might give an explanation immediately, adjust the pacing of a single class period, or instead adjust instruction the next time they taught the course. Although it is valuable to make changes for subsequent semesters, the students currently in a class would not benefit from these adjustments. Further, low-leveragers never discussed adjusting the content or problems for the following class period. High-leveragers, on the other hand, rarely made changes for subsequent semesters without first making changes within the current semester. For example, in response to the interviewer's question, "Are you making any changes based on students' understanding or how they responded last year?", Hen said:
Not terribly much because usually if they don't get it I address it right then. And then I make that change for the next class [period]. So say they don't get something from class today, I'll make that change for Friday. And so then I've already made the change so that it's ready for next year.

Knowledge: high-leveragers relied more heavily on knowledge of student thinking than low-leveragers
High-leveragers drew on knowledge of student thinking to enact every stage of leveraging student thinking. Specifically, they relied on this knowledge to access richer student thinking, shape real-time interpretations of student thinking, and inform when and how to respond to student thinking (Fig. 1). High-leveragers' knowledge was evident throughout the interviews in their discussions of common (and not so common) student thinking and difficulties with specific topics. Knowledge of student thinking, a well-described component of pedagogical content knowledge, includes awareness of common student difficulties and misconceptions about a specific topic (e.g., Park & Oliver, 2008).
High-leveragers demonstrated awareness of common student difficulties as they anticipated specific content that would be challenging for students. They designed questions and problems aimed at revealing and addressing these anticipated difficulties. For example, in the pre-instruction interview, Helge discussed creating and posing questions that would reveal student difficulties, commenting that this helped students to solve problems without repeating the same mistakes. She said,
I've been teaching for a long time and I know where they're going to get in trouble. So when I prepare for this lesson, I purposely try to get them in trouble to try to make them recall all this stuff. … They always have trouble with this, and … I've thought about it a lot actually, when I've been planning these class lessons about why they have the trouble.
It is important to note that carefully designed problems, like those that Helge discussed, can also provide opportunities for high-leveragers to access additional student thinking, which in turn can support further development of knowledge of student thinking.
Knowledge of student thinking aided high-leveragers in identifying what student thinking could be productive to pursue, what difficulties would be common and needed to be addressed with the whole class, and what student thinking would be less productive and therefore should be redirected. In the following excerpt, Hen highlighted her thought process as she first tried to understand a student's thinking in detail, then recognized the student's difficulty as uncommon, and responded accordingly.
She said something about a part of a molecule being charged that's not charged. I was trying to understand why she would even say that, and … I was trying to think 'What is she thinking? Like how is this even in her head?' I think that what she was looking at is like carbon is a ... big atom and hydrogen is a small atom, and so it must be uneven somehow. … I didn't ask her enough to try and understand where she was coming from … I didn't want to spend a ton of time on that because that is not a common idea. Most of the students can look and say, 'This is charged; this is not charged.' So that's why I wouldn't have brought it to the whole class … I have to [quickly] check way off-base ideas. So that was the series of questions that I was trying to figure out, like … 'What do you see?' … I could tell that she was somehow paying attention to the wrong features, and so instead of trying to understand why she was paying attention to the wrong features, I was trying to get her to attend to the correct features. I was trying to ask questions in a way that got her on the right track rather than spending a lot of time.
This quote illustrates that Hen draws on her knowledge of student thinking to diagnose what the student needed to make progress towards achieving the learning objectives and to decide whether or not this student thinking was common enough to bring to the whole class to address.
High-leveragers also drew on their knowledge to interpret low-resolution information during class. Knowledge of student thinking seemed to enable high-leveragers to quickly recognize indications that a student was experiencing a common difficulty that the instructor had anticipated. This interpretation happened quickly and seemingly automatically. In the following quote, Helge described that she knew and could anticipate where students would struggle with specific content. Consequently, she waited until she saw her students' looks of confusion, and then, because of her knowledge, she was able to respond by asking a question that would help direct them on the problem. Helge said,
I put the question up and then I anticipate they're going to read the question; they're going to start working and then they're going to look at me really perplexed. I wait until I get the look, and then I ask them if they're stuck, and they are. And so … I will say, 'Do you remember this from [the prerequisite class] or do you remember this from earlier this semester?' And then they go, 'Oh!' and then they start working again.
In contrast to high-leveragers, low-leveragers appeared to lack knowledge of student thinking. It is important to note that they, like the high-leveragers, were experienced instructors. High- and low-leveragers each had over 10 years of experience teaching undergraduate STEM courses and had taught the target course for at least three semesters. Lacking this knowledge, low-leveragers made assumptions about what students were thinking. In the following segment, Les describes creating a lesson plan that would logically flow for students by putting himself in the mind of a student since he does not have access to student thinking. Les said the following in the pre-instruction interview,
It's more, I think, along the lines of trying to put myself in the mind of the student than it is direct feedback that I get from most students saying that was really confusing. I have no idea, because most of them are reticent to tell you what they may be thinking. … And so I think you just have to constantly be reminding yourself of where they are in their intellectual progression, and as best you can put yourself back in the mind of a 19-year-old, second year biology student. So I do try to do that. I don't know how successfully.
This lack of knowledge of student thinking impacts Les's lesson design since he does not have knowledge of, nor much access to, student thinking. It is important to note that a faculty member with a PhD, even with the best intentions, is likely to think differently about the content than a student, and may have had different experiences learning the content when they were a student themselves. Further, this quote indicates that Les seems to be waiting for students to take action to share their thinking with him, whereas high-leveragers deliberately and regularly seek out information about student thinking.
One participant who was neither a high- nor a low-leverager further highlighted the importance of knowledge of student thinking for successfully leveraging student thinking. This participant, Isa, was a graduate student leading a course for the first time, and so we would not expect her to have extensive knowledge of student thinking, as this is often built from experience (McAlpine et al., 2006; van Driel et al., 1998; van Es, 2011). Indeed, Isa lacked this knowledge and, though she accessed student thinking, she struggled to make sense of it and to respond. In short, her lack of knowledge inhibited her from truly leveraging student thinking.
Isa stood out because her approaches for accessing student thinking were similar to those of high-leveragers, yet she could not capitalize on this access. Specifically, analysis of her target class lesson indicated that Isa spent 33% of the class period in activities that could potentially give her access to high-resolution information about student thinking, which is more similar to high-leveragers than low-leveragers (Fig. 2). She accessed this information as students shared their ideas in whole class discussions (11% of class time) and as students talked with her in their small groups (22% of class time). Recall that accessing student thinking by talking to students in small groups was a hallmark of high-leveragers and essentially absent in the target classes of low-leveragers (Fig. 2). Despite this extensive access to student thinking, Isa had trouble making sense of the information she encountered. In the following quote, Isa discussed being confused by students' responses to a question she posed. Her lack of knowledge impacted her real-time interpretations of student thinking, which ultimately resulted in a response that did not leverage students' thinking. Isa said:
Further, Isa's lack of knowledge of student thinking made it challenging for her to anticipate what might be difficult for students. High-leveragers, on the other hand, were able to draw on their knowledge to anticipate and interpret student thinking. This allowed them to regularly respond to their students' thinking by making adjustments to instruction during class and for subsequent lessons, something that was very challenging for Isa. Isa often commented that what students struggled with was different than she expected. She deliberated with herself about whether or not she was sufficiently emphasizing important topics, questioned her pacing for the course, and reflected on how she might change in the future. In particular, she wanted to be more proactive in supporting student thinking and gauging student progress earlier, before summative assessments. Isa's recognition that she lacked knowledge is likely an important step in working to build knowledge.

Discussion
This research suggests that courses that actively engage students are not always centered on student thinking. In other words, student-centered instruction is not necessarily student-thinking-centered instruction. The classes that we investigated had replaced some didactic lecture time with time when students worked. If that is more broadly representative of these participants' teaching, these courses would likely be considered "active" in their university and would compare favorably to national samples of STEM courses, which tend to primarily consist of students listening to the instructor lecture (e.g., Stains et al., 2018). Yet, what actually occurred in target classes, and the extent to which participants focused on student thinking, varied considerably.
Leveraging student thinking may be uncommon within undergraduate STEM courses. A key behavioral difference between high- and low-leveragers was the amount of class time spent talking to small groups (Fig. 2). This was an important approach for accessing and interpreting student thinking for all of the high-leveragers, and was essentially absent for low-leveragers during the target class. If most STEM instructors are rarely eliciting substantive student thinking, as recent studies of teacher discourse suggest (e.g., Alkhouri et al., 2021; Kranzfelder et al., 2020), instructors are missing many opportunities to leverage student thinking. Along these lines, a study of three undergraduate biochemistry instructors observed that instructors saw value in the detail they could glean about student thinking in one-on-one and small-group interactions during office hours, but did not recognize that they could also achieve this during class time (Offerdahl & Tomanek, 2011). As these instructors adopted the use of clicker questions, they grew to appreciate the chance to learn about common student difficulties, but they did not use what they learned to alter the direction of the lesson or subsequent lessons (Offerdahl & Tomanek, 2011). Together, this scholarship suggests that leveraging student thinking is challenging for undergraduate STEM instructors and does not necessarily follow from replacing didactic lecture with time for students to work.
If we hope to support instructors in leveraging student thinking, we must first appreciate how this occurs in undergraduate STEM courses. It is informative that our observations of high-leveragers align in important ways with prior observations of expert teacher noticing and responsiveness in K12 educational contexts (e.g., Robertson et al., 2016; van Es, 2011). Like skilled K12 teachers, high-leveragers tended to focus on making sense of the substance of student thinking, rather than rushing to evaluate the accuracy of student thinking. They also carefully considered the relationship between the tasks they designed and student thinking. Further, high-leveragers reasoned through the thinking elicited from students and designed new problems for subsequent lessons based on observed student thinking. This allowed emergent student thinking to determine the direction of lessons, which is a key feature of expert teacher noticing and responsiveness in K12 teachers (e.g., Robertson et al., 2016; van Es, 2011).
Given the documented benefits of leveraging student thinking for K12 students and teachers (e.g., Carpenter et al., 1989; Empson, 2003; Thornton, 2006; Warren et al., 2001), and the evidence that teachers across levels struggle to leverage student thinking, further exploration is needed to understand what supports and hinders instructors in leveraging student thinking. Leveraging student thinking may depend on an instructor's knowledge, abilities, and dispositions, and there may also be important contextual factors that limit an instructor's ability to leverage student thinking. We discuss hypotheses about what contributes to leveraging student thinking in light of our findings and other relevant research.
We observed that participants' knowledge of student thinking, which is a component of pedagogical content knowledge, played a critical role in their ability to leverage student thinking. Other research across STEM disciplines has also indicated an important role for knowledge of student thinking in evidence-based teaching. Undergraduate mathematics and biology instructors rely on knowledge of student thinking to make sense of (interpret) and build on student contributions (respond), and instructors who lack this knowledge have struggled to effectively implement evidence-based instructional practices and curricula (e.g., Andrews et al., 2019; Johnson & Larsen, 2012; Speer & Wagner, 2009; Wagner et al., 2007). Offerdahl et al. (2018) pointed to the role of knowledge of student thinking in formative assessment, which shares many similarities with leveraging student thinking, including using evidence of student understanding to monitor student progress and responding in a way that supports students in achieving the learning goals. Collectively, this research lends support to the hypothesis that knowledge of student thinking contributes to instructors' abilities to leverage student thinking in undergraduate STEM courses. Importantly, leveraging student thinking may also help to build knowledge of student thinking (e.g., Chan & Yung, 2015). Thus, a positive feedback loop may occur, whereby leveraging student thinking builds knowledge of student thinking, which facilitates better leveraging, and so on.
Other teaching knowledge may also impact whether and how instructors leverage student thinking, including knowledge about how people learn. High-leveragers seemed to think, implicitly or explicitly, that learning involves students constructing their own knowledge. Thus, to support student learning, high-leveragers aimed to be aware of and responsive to student thinking. Their knowledge about how people learn enabled them to create opportunities for students to try out their knowledge on a task or problem, realize what they knew and did not know, and work individually and with others to develop new ideas and reorganize their existing ideas. Low-leveragers, on the other hand, seemed to fundamentally think that the most effective way to learn was to hear accurate ideas. This impacted how they responded to student ideas during class. Low-leveragers tended to jump to their own correct explanation of a topic instead of building on student contributions. These findings suggest that knowledge about how people learn could be important to prioritizing student thinking and creating conditions that allow instructors to leverage student thinking. This builds on prior work that has demonstrated a role for this knowledge in active-learning instruction (e.g., Andrews et al., 2019).
Importantly, prior experience teaching course topics may be necessary but insufficient for leveraging student thinking. High-leveragers had taught the content in the target lesson for at least five years and had taught undergraduate courses for even longer (mean = 17.7 years; SD = 11.2 years). However, low-leveragers were equally experienced, so experience alone was not sufficient to result in high levels of leveraging student thinking. Nonetheless, teaching experience may have provided instructors with prior opportunities to build knowledge of student thinking and knowledge of how people learn. Teaching experience may also breed confidence and flexibility. High-leveragers described regularly adjusting their instruction from lesson to lesson and even among different sections of the same course. Further, they seemed to be able to make quick judgements and decisions while teaching. This level of flexibility is impressive and may seem impossible to instructors who are still working to build course structure, class climate, and lessons for a full semester.
Beyond knowledge and experience, leveraging student thinking may be supported by an instructor's propensity to see value in student thinking, and to be curious about student thinking, even when the thinking is incomplete or incorrect. Researchers at the K12 level have sometimes referred to a valuing of student thinking as an orientation or disposition (e.g., Stockero et al., 2020; Thornton, 2006). Recently, van Es and Sherin (2021) proposed an expansion of the teaching noticing framework that notes the importance of a "stance of inquiry" in interpreting student thinking. They proposed that interpreting student thinking requires not just working to make sense of student thinking but also seeing student thinking as something worth figuring out. High-leveragers demonstrated that they valued student thinking in the way they were curious about student thinking and reflected on their students' ideas. Low-leveragers tended to correct or redirect student thinking, which might indicate that they did not value in-progress student thinking as a resource to be capitalized on during instruction. Exposing undergraduate STEM instructors to student thinking, and supporting them in identifying student thinking that can be productive to build on, may help instructors value student thinking and view it as a resource (e.g., Kazemi & Franke, 2004). This, in turn, can support their knowledge development and skills to access, interpret, and respond to student thinking. Future research might consider the role a propensity to value or be curious about student thinking plays in developing knowledge of student thinking or consider how valuing student thinking sets the stage for leveraging student thinking.
We must also consider whether contextual factors influence STEM instructors' efforts to leverage student thinking. One contextual factor is pressure to cover content, which may result in instructors feeling like they do not have the time to access, interpret, and respond to student thinking. Our results do not support the idea that content coverage and leveraging student thinking are mutually exclusive goals. High-leveragers taught introductory courses that were designed to cover considerable content. They also taught courses that were coordinated across multiple instructors to cover the same content. So how did high-leveragers balance the need to cover particular content, while also leveraging student thinking? Our study did not set out to address that question specifically, but we have a few insights. High-leveragers focused on key concepts and seemed to indicate that they had cut extraneous content that was not central to key concepts. High-leveragers also had a vision of how topics in the course built on one another, and therefore could determine that additional time spent on one topic (because students were struggling) would result in students needing less time to learn subsequent topics. Additionally, high-leveragers' knowledge of student thinking enabled them to predict which topics were truly difficult for students to learn and which could be learned more easily. They focused in-class learning time on the difficult topics and incentivized students to learn easier topics on their own, using graded quizzes and homework.
Another contextual factor that is often proposed as a barrier to student-centered instruction is large class sizes (e.g., Apkarian et al., 2021). Our study provides evidence that it is possible to leverage student thinking in large courses. Two high-leveragers taught courses with 100 or more students, whereas low-leveragers taught courses with just over 70 students. This finding is in line with other research indicating that many STEM instructors use student-centered instruction in large courses (e.g., Freeman et al., 2014; Stains et al., 2018). Overall, course size may be more of a perceived barrier than it is a logistical barrier. We recommend focusing future research about course size on determining what it takes to convince instructors that course size does not have to be a barrier and to support them in implementing evidence-based strategies effectively in large-enrollment courses.
The findings of this research raise questions about what teaching professional development could help undergraduate STEM instructors develop knowledge and skills for leveraging student thinking. One promising model, Cognitively Guided Instruction (CGI), comes from the K12 context and is rooted in research on student mathematical thinking and development (Carpenter et al., 1989). Guiding principles of CGI involve teaching instructors that they can develop new knowledge by (1) focusing on student thinking through looking at student work, eliciting student thinking, and listening carefully to students; (2) striving to understand the details of the students' thinking; and (3) reconsidering their own existing knowledge in light of new knowledge about student thinking (Carpenter et al., 1989). Using this model as inspiration, teaching professional development for undergraduate STEM instructors could create opportunities for instructors to examine their own students' work in a professional learning community. This setting would allow instructors to collaboratively make sense of and discuss unclear, incomplete, or in-progress student thinking to identify evidence of student understanding (e.g., Kazemi & Franke, 2004). K12 teachers who practiced CGI guiding principles and engaged in professional learning communities tended to view students' thinking as central to their instruction, created opportunities to build on student thinking, used students' thinking to inform instruction, and continued to improve their teaching practice long after their initial professional development experience (Franke et al., 2001). As such, teaching professional development grounded in these guiding principles may be able to support undergraduate STEM instructors in developing the skills necessary to learn from their teaching and to build knowledge that could support leveraging student thinking.

Limitations
Readers should consider the limitations of this work. The sample size and scope limit generalization. We studied a small sample of faculty at just one institution. This work focused on one target lesson, which is unlikely to represent the whole of participants' teaching. Consequently, we cannot generalize to how participants taught their whole course or beyond these participants. Nonetheless, this work suggests important avenues for future research about how instructors do or do not leverage student thinking.
It is also important to recognize that our research methods do not provide access to all of the thinking that informed instructional planning and real-time decision-making. More expert practitioners, in any field, are expected to have automated behaviors that result from extensive experience (Schön, 2017; Sternberg & Horvath, 1995), and automaticity may make this expertise hard to observe and hard for experts to describe. Relatedly, teaching likely draws on tacit knowledge in addition to explicit knowledge. Explicit knowledge can be articulated and shared, but tacit knowledge is hard to articulate and may feel more like intuition and rules of thumb (Smith, 2015; Sternberg & Horvath, 1995). Tacit knowledge plays a role in expert teaching, especially in how teachers respond to events flexibly and spontaneously while teaching (e.g., Smith, 2015; Sternberg & Horvath, 1995). Research approaches that present teachers with novel and authentic teaching tasks, and record thinking and behavior in response to these tasks, may be best suited to document the deployment of tacit knowledge in real time (e.g., Alonzo & Kim, 2016). We recommend that future research consider such approaches for further exploring how undergraduate STEM instructors leverage student thinking. One strength of the work in this manuscript is that data collection was contextualized in the participants' classrooms, which is ideal for capturing the "messiness" and contextualization of teaching (Alonzo & Kim, 2016). Though our research approach