Building the expert teacher prototype: A metasummary of teacher expertise studies in primary and secondary education

While expert teachers remain a frequent focus of research in education, to date there have been very few attempts to conduct systematic reviews of this literature. This paper presents the findings of the first systematic metasummary of research on teacher expertise in K12 education (primary/elementary and secondary levels), based on analysis of 106 empirical studies from 16 countries involving 1124 teachers identified as experts. The inductively-developed coding framework was applied independently by both authors to the dataset to generate agreement counts for specific coding themes, firstly for specific domains of teacher expertise, and then stratified to compare primary and secondary studies. We present 73 specific features organised into six domains in our expert teacher prototype. Salient findings indicate that, with regard to professional practice, expert teachers reflect extensively and often critically on their practice, help their colleagues frequently, and are continuous learners throughout their careers. Concerning knowledge, we find that expert teachers have well-developed pedagogical content knowledge and knowledge about their learners. In the domain of pedagogic practice, we observe that expert teachers display flexibility in the classroom, build strong interpersonal relationships with their learners, whom they engage through their choice of activities and content, and frequently make use of strategies typically emphasised in both constructivist and learner-centred education literatures. We offer our prototype as a useful initial sketch of family resemblance among expert teachers rather than a checklist of necessary or expected features of expertise, also cautioning that the prototype remains far from complete.


Introduction
The study of teacher expertise has shown itself to be both a popular and a complex subject of educational research over the last 40 years. It has proven to be more challenging to study than many other types of expertise, due, in part, to its diverse manifestations (Berliner, 2004) and, in part, to the complex social contexts (schools and classrooms) in which it develops and is typically studied (Stigler & Miller, 2018). As such, current understandings of teacher expertise are largely limited to what can be gleaned from non-systematic reviews of prior work in certain areas of this field (e.g., Berliner, 2001; Schempp et al., 2002), and from reading the many individual studies themselves. Despite the existence of hundreds of such research reports of teachers characterised as experts, there are, to our knowledge, no published systematic reviews of this ever-expanding literature for K12 (primary and secondary) community (see Anderson, 2021; Bucci, 2003). Such authors have drawn on more socioculturally embedded approaches to sampling participants, for example, by asking key stakeholders in the local community (e.g., school inspectors, headteachers or teacher educators) to nominate potential participants (e.g., Waynik, 2013; Wolff et al., 2015), or by selecting those teachers who also occupy leadership roles (e.g., mentors, teacher educators) as study participants (e.g., Stahnke & Blömeke, 2021b; Swanson et al., 1990); some of these studies have adopted two or more such criteria (e.g., Anderson, 2021; Solmon & Lee, 1991).
This wide variation in sampling strategies is discussed critically by Palmer et al. (2005), and constitutes a potential validity threat to the teacher expertise literature: How is it that something so familiar as 'expertise' can be so difficult to define and identify consistently in education? It is accompanied by a parallel methodological challenge to the researcher interested in reviewing these studies: How can one bring together and summarise the findings of this diverse body of research systematically? It is perhaps due to this dual challenge that, to our knowledge, almost no prior attempts have been made to review this literature systematically, although Van Dijk et al.'s (2020) framework for expertise in higher education is a notable exception at tertiary level. To our knowledge, no published systematic reviews (meta-analyses, meta-syntheses or metasummaries) of primary and secondary teacher expertise studies exist. Nevertheless, if conducted appropriately, such an endeavour has the ability to offer an empirically-based realisation of Sternberg and Horvath's (1995) expert teacher prototype and also to explore the extent to which the identified features of this prototype appear to be similar at primary and secondary levels.
As our aim is to build this expert teacher prototype inductively, we avoid adopting a strict a priori definition of expertise. Like Ericsson (2018), we recognise the intuitiveness of "expert" and "expertise" as familiar, valued, everyday constructs that are nonetheless applied widely and in various ways in all domains of social practice, including education. As such, our review includes all qualifying studies in which some or all teacher-participants are characterised as experts as potentially able to contribute to the prototype. In this sense, our conceptualisation of teacher expertise is the prototype itself, offered below. This approach reduces the danger that any presumptive definition of expertise might bias the study findings through the initial application of overly-stringent inclusion criteria (see Glass, 2000), and is consistent with the ethos underpinning metasummary, our chosen methodology.

Aim and research questions
The primary aim of this study is to collate and present the most frequently reported findings of teacher expertise studies with regard to the cognition (including knowledge, beliefs and cognitive processes), personal attributes and practices (pedagogic and professional) of expert teachers of any subject at primary and secondary levels (i.e., K12 education) around the world. The research questions are formulated as follows: 1. What are the most commonly reported findings of studies of expert teachers concerning their cognition, personal attributes and practices? 2. To what extent do these findings vary between primary and secondary contexts?

Methodological framework
This study approaches its research questions through the application of Sandelowski and Barroso's metasummary, largely as they describe it. Any divergences from their procedure, all taken to strengthen methodological validity and reliability, are described and justified below as appropriate to the phenomenon under investigation (teacher expertise) and the research literature involved. As far as is compatible with the metasummary approach, we also follow contemporary guidelines on conducting systematic reviews to ensure appropriate rigour and replicability throughout (Maeda et al., 2022; Page et al., 2021).
While meta-analysis is the approach typically used to bring together the findings of different quantitative studies, and metasynthesis, in its diverse forms (see Maeda et al., 2022; Thorne et al., 2004), serves as a means to do similarly for qualitative research, Sandelowski and Barroso's (2007) "metasummary" or "mixed research synthesis" enables the researcher to aggregate findings from diverse studies across the quantitative-qualitative paradigm divide with a degree of systematicity and rigour that enables tentative, contingent, potentially replicable generalisations to be made (Denny, 2018). Metasummaries typically begin by identifying and including all potentially relevant studies investigating a phenomenon, regardless of quality, rather than setting a priori quality criteria. By doing so, they avoid the danger of researcher bias pre-filtering the studies included (see Glass, 2000). The findings of all these studies are then identified and abstracted to a degree whereby they can be compared, aggregated and counted quantitatively. Typically, assessment of quality occurs at this later stage, when both frequency and intensity effect sizes are used to identify which findings are likely to be sufficiently robust for reporting; this second stage is modified here through the addition of two criteria (an independent agreement criterion and a threshold criterion) appropriate to the complex and diversely theorised construct of teacher expertise under investigation.

Literature search process
Our literature search began with the identification of appropriate search criteria, including "topical (what), population (who), temporal (when), and methodological (how) parameters" (Sandelowski & Barroso, 2007, p. 35). As such, we identified studies of expertise among primary and secondary teachers conducted at any time using qualitative, quantitative and mixed methods designs as suitable (i.e., the widest possible parameters for our area of interest). We made use of four databases (ERIC, Proquest, Web of Science, Google Scholar), aiming to identify indexed empirical studies in which teacher participants are described as 'experts', or as having 'expertise', either in the title or abstract of the paper. Two Boolean searches were conducted in each database, as follows: 1. "High precision searches" (Sandelowski & Barroso, 2007, p. 35) for the two most frequently used terms in such studies: "expert teacher(s)" and "teacher expertise"; 2. "High-recall searches" (p. 35) for a wider range of terms, always including "expert*" but allowing for truncation, intermediate terms (e.g., "expert maths teacher") and common alternatives to teacher (e.g., "practitioner", "teaching" or "pedagogical" expertise, "educator", etc.).
Two examples of search syntax (as used in the ERIC database) are provided in Fig. 1. These searches yielded 3207 and 6991 works respectively, including published and unpublished papers, reports, books, chapters and articles, as well as PhD and MA theses (hereafter "works"). The removal of duplicates left 5323 works, which were then screened (Stage 1 screening) for relevant empirical studies. Most rejected at this stage were non-empirical works (e.g., opinion pieces, practical guides) or studies that did not involve teachers characterised as experts (e.g., those studying the development of "teacher expertise" as professional competence). This left 551 works for Stage 2 screening.
Stage 2 involved screening the 551 remaining works for pre-defined basic inclusion and exclusion criteria. In keeping with Sandelowski et al.'s recommendations (2007, p. 2), "no report was excluded for reasons of quality" and inclusion criteria were kept as simple as possible to allow the maximum number of empirical studies that could reasonably be argued to involve expert teachers to be included. To be eligible, a work needed to include description of the findings of one or more original empirical studies investigating aspects of one or more "expert" teachers' cognition or practice in primary or secondary school settings. Consistent with consensus in the teacher expertise literature, two broad criteria for participants to qualify as experts were set: 1. Sufficient experience to allow for expertise to develop (nominally set at 5 years; Berliner, 2004; Palmer et al., 2005; Tsui, 2005). 2. Some attempt beyond either experience or basic qualified teacher status (QTS) to justify characterising participants as experts. Here we drew primarily on Palmer et al.'s (2005, p. 17) description of appropriate markers, including social recognition (e.g., nomination by a key stakeholder), performance criteria (e.g., learner achievement, receipt of teacher awards) and professional/social group membership (e.g., teacher educator status, advanced certification).
Works that involved teacher participants nominated for their expertise only in a subcategory of pedagogic practice (e.g., shared reading, ICT) were also excluded, as were those that provided overly vague descriptions of criteria application (e.g., "X came highly recommended", as opposed to confirmation that nomination was used). The majority of works rejected at this stage (n = 443) involved teachers characterised as experts despite having only experience and/or QTS; smaller numbers of theoretical and unsystematic review pieces were also rejected. A final search involved both forward and back-checks of citations in those works that passed Stage 2 screening, yielding an additional 13 titles of relevance.
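The Stage 2 criteria described above can be read as a simple eligibility check. The sketch below is illustrative only: the record structure, field names and marker labels are hypothetical and were not part of the review procedure, and the 5-year value is treated as a minimum, which is an assumption about how the nominal threshold was applied.

```python
from dataclasses import dataclass, field

# Marker categories paraphrased from the text (after Palmer et al., 2005).
EXPERT_MARKERS = {"social_recognition", "performance_criteria", "group_membership"}
MIN_YEARS = 5  # nominal experience threshold; treated here as a minimum (assumption)


@dataclass
class Work:
    """Hypothetical screening record for one work; all field names are illustrative."""
    empirical: bool
    levels: set                      # e.g. {"primary"}, {"secondary"} or both
    min_experience_years: float      # least experienced "expert" participant
    markers: set = field(default_factory=set)
    subcategory_only: bool = False   # expertise claimed only for, e.g., shared reading or ICT
    criteria_vague: bool = False     # e.g. "X came highly recommended"


def passes_stage2(work: Work) -> bool:
    """Return True if the work meets the Stage 2 criteria as summarised above."""
    if not work.empirical or not (work.levels & {"primary", "secondary"}):
        return False
    if work.min_experience_years < MIN_YEARS:
        return False
    if not (work.markers & EXPERT_MARKERS):
        return False  # experience and/or QTS alone does not qualify
    return not (work.subcategory_only or work.criteria_vague)


nominated = Work(True, {"secondary"}, 8, {"social_recognition"})
experience_only = Work(True, {"secondary"}, 12)
print(passes_stage2(nominated), passes_stage2(experience_only))
```

Under these assumptions, a work whose participants were nominated by a key stakeholder passes, whereas one relying on experience alone does not, mirroring the most common reason for rejection at this stage.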
Thus, a total of 121 works of relevance were identified during the literature search. Any of these found to be reporting on the same study, dataset or expert teachers (e.g., a PhD dissertation and a subsequent journal article based on it) were grouped together to avoid duplication, with any relevant findings being counted only once. This yielded a total of 106 "study-sets" for the coding stage. Hereafter, we follow Sandelowski and Barroso (2007, p. xvii) in using the term "report" to refer to each of these study-sets, because "in research synthesis projects, you are not reviewing studies per se, but rather the reports of those studies".
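The counts reported for the search and screening stages can be checked for internal consistency with a few lines of arithmetic. The sketch below uses only figures given in the text; it assumes that n = 443 refers to all works rejected at Stage 2, and the variable names are illustrative.

```python
# Counts reported in the text for each step of the literature search.
high_precision_hits = 3207
high_recall_hits = 6991
after_deduplication = 5323
after_stage1 = 551
rejected_stage2 = 443          # assumed to be the total rejected at Stage 2
citation_check_additions = 13
study_sets = 106

# Derived quantities (not stated in the text, implied by the counts above).
total_hits = high_precision_hits + high_recall_hits
duplicates_removed = total_hits - after_deduplication
rejected_stage1 = after_deduplication - after_stage1
passed_stage2 = after_stage1 - rejected_stage2
relevant_works = passed_stage2 + citation_check_additions
merged_into_other_sets = relevant_works - study_sets

print(total_hits, duplicates_removed, rejected_stage1)
print(passed_stage2, relevant_works, merged_into_other_sets)
```

On this reading, the 108 works passing Stage 2 plus the 13 found through citation checks give the reported total of 121, of which 15 were absorbed into other study-sets to leave 106.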

Theme generation and study coding
The coding process involved two stages: development of coding themes and the coding of the studies themselves. In the first of these stages, the first author began by reading through a large sample of the qualifying study reports alongside the wider literature on expertise (e.g., Bereiter & Scardamalia, 1993; Berliner, 2004; Ericsson, 2018) to build sufficient theoretical understanding of the construct. Qualitative data analysis software (MAXQDA) was used to generate codes iteratively during this process, leading to categorisation in a number of "domains" (knowledge, cognition, beliefs, pedagogic practice, attributes, professional practice). Each code was summarised as a short descriptive "theme". Any themes that were found to be broadly analogous or overlapping were merged and theme descriptors were amended to retain clarity. For example, a number of reports described expert teachers who had beliefs in "setting high challenges" or "having high expectations" of learners; these were grouped under the theme "Having high expectations/setting high challenges for learners". This process continued until a point of saturation (after 76 reports), when the reading of additional reports did not add significant new themes to the coding list. This led to a long, but manageable coding list of 180 themes. The two authors then met to discuss and ensure understandings of the themes, and the constructs behind them (e.g., 'adaptive expertise', 'pedagogical content knowledge') were shared, although reference to specific papers was avoided to ensure the second author's coding was not influenced during this process. The second author then attempted to use this coding list to code a large sample of the reports (n = 66) to assess whether any additional codes were required. These were then discussed and two additional themes were proposed, which were added to the coding list to make 182 in total. Once more, we avoided discussing specific studies during this process so as not to influence the main coding stage.
In the second coding stage both authors read through all 106 study reports independently to identify their 'findings', defined here as descriptive statements about any aspect of a participant expert teacher's cognition, attributes or practice. Each author assigned a given theme to a report when they considered that the report authors presented a finding that agreed with this theme. As the metasummary procedure recommends, we took care not to code authors' discussion of prior studies (e.g., in the literature review), coding only those findings that were presented as original to the empirical study in question. These were presented most often in the "Findings" or "Results" section of reports, but also occasionally in "Discussion" or "Conclusion" sections; if so, these were also coded, providing they were clearly separable from authors' discussion of prior research.
Our independently coded matrices were then brought together. At this stage, our procedure differed from Sandelowski and Barroso's (2007) for a combination of reasons. They recommend comparing, discussing and attempting to agree on any differences of opinion (what they call "negotiated consensus"; 2007, p. 230). However, we chose to adopt an alternative, more objective procedure that took advantage of the larger number of reports in our sample while retaining a manageable workload. As our primary aim was to identify the most commonly reported findings (rather than an exhaustive list of these), instead of attempting to agree on areas of difference (when raters may be biased towards agreement), we left our original, independent codings unchanged, and a theme was only counted for a specific report if both authors had independently assigned it to the report during the main coding stage (i.e., an independent agreement criterion).
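The independent agreement criterion amounts to taking, for each report, the intersection of the two coders' theme assignments, and then tallying how often each theme survives. The following is a minimal sketch of that logic, not the authors' actual procedure or tooling; the report identifiers and codings are toy data invented for illustration.

```python
from collections import Counter


def agreed_codes(coder_a, coder_b):
    """Keep a theme for a report only if both coders assigned it independently."""
    reports = coder_a.keys() | coder_b.keys()
    return {r: coder_a.get(r, set()) & coder_b.get(r, set()) for r in reports}


def agreement_counts(agreed):
    """Tally, per theme, the number of reports on which both coders agreed."""
    counts = Counter()
    for themes in agreed.values():
        counts.update(themes)
    return counts


# Toy codings (hypothetical): report id -> set of themes assigned by that coder.
coder_a = {"R1": {"PCK", "reflects extensively"}, "R2": {"PCK"}, "R3": {"flexibility"}}
coder_b = {"R1": {"PCK"}, "R2": {"PCK", "flexibility"}, "R3": {"flexibility", "PCK"}}

counts = agreement_counts(agreed_codes(coder_a, coder_b))
print(counts["PCK"], counts["flexibility"], counts["reflects extensively"])
```

Because the original codings are left unchanged, a theme assigned by only one coder simply contributes nothing to the tally, which is what makes the criterion conservative.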
As the challenge of identifying evidence for a large number of themes within what were sometimes lengthy texts was considerable, we anticipated a fairly low level of inter-rater reliability. Although overall observed agreement was high (90.5%), because each coder assigned only a small number of the total themes to each report (a mean of 17), inter-rater reliability analysis using Cohen's kappa coefficient was required (Cohen, 1960; Hallgren, 2012). As suspected, this indicated only "moderate agreement" (Landis & Koch, 1977): κ = 0.440 (95% CI, 0.418 to 0.461), p < 0.001 (see Table 1). As a result, we chose to adopt an additional measure of reliability, consistent with our main aim of identifying "the most commonly reported findings" of the studies that, nonetheless, also served to reduce the likelihood of false positives creeping into our presented findings. Because the themes were being assessed across multiple reports, we were able to generate an agreement count for each theme: how often we agreed on a theme across the total number of reports. In the main study findings below, we report primarily on those themes where the agreement count reached a threshold value of five (i.e., on at least five occasions, the two coders agreed that a theme was present in a particular study; our threshold criterion), constituting 73 themes in total as the most commonly reported findings (our choice of this threshold value is discussed critically in 7.1 below). These agreement counts are also presented (in Table 3, below), and serve as tentative indicators of frequency for each finding, which we prefer as more cautious alternatives (given our challenge) to the "manifest frequency effect sizes" that Sandelowski and Barroso (2007, p. 160) propose. In order to answer the second research question, our dataset was then stratified to separate and compare the findings of studies conducted at primary/elementary level (n = 31) with those at secondary level (n = 66). Because sample sizes were smaller, and differed between the two groups, we have used standard competition ranking (Liang et al., 2020) of agreement counts to display and compare the rank order of specific themes between primary and secondary levels.
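To illustrate why high observed agreement can coexist with only moderate kappa, and how standard competition ranking treats ties, a minimal sketch follows. It rests on a simplifying assumption that is ours, not the paper's: that each coder assigned on average 17 of the 182 themes per report, so that both coders' marginal "theme present" rate is 17/182. The reported kappa was computed from the full coding matrices. The agreement counts are those listed in Table 3 for six cognitive-process themes, with one hypothetical below-threshold theme added to show the threshold criterion.

```python
def cohen_kappa(p_observed, prevalence_a, prevalence_b):
    """Cohen's kappa for two coders making present/absent decisions."""
    # Chance agreement: both say "present" or both say "absent".
    p_expected = (prevalence_a * prevalence_b
                  + (1 - prevalence_a) * (1 - prevalence_b))
    return (p_observed - p_expected) / (1 - p_expected)


def competition_ranks(counts):
    """Standard competition ranking: ties share a rank and the next rank is skipped."""
    ordered = sorted(counts.values(), reverse=True)
    # index() returns the first position of a value, so tied counts share a rank.
    return {theme: ordered.index(count) + 1 for theme, count in counts.items()}


# Assumption (not from the paper): both coders assign 17 of 182 themes per report.
prevalence = 17 / 182
kappa = cohen_kappa(0.905, prevalence, prevalence)
print(round(kappa, 3))  # 0.439, close to the reported 0.440

THRESHOLD = 5  # threshold criterion: at least five agreements
counts = {
    "high awareness of class": 12,
    "automated processes/heuristics": 8,
    "primary concern with learning": 6,
    "informed decisions in class": 6,
    "progressive problem solving": 6,
    "predicts potential problems": 5,
    "hypothetical rare theme": 4,  # invented, to show the filter
}
reported = {t: c for t, c in counts.items() if c >= THRESHOLD}
ranks = competition_ranks(reported)
print(sorted(ranks.values()))  # [1, 2, 3, 3, 3, 6]
```

Because most themes are absent from most reports, two coders would agree on "absent" very often by chance alone; kappa discounts this, which is why 90.5% raw agreement corresponds to roughly 0.44 here. In the ranking, the three themes tied on six agreements share rank 3 and the next theme is ranked 6.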

Findings and initial discussion
After the initial presentation of descriptive statistics we present and discuss the findings under the following domains: cognition (divided into knowledge base, cognitive processes and beliefs), personal attributes, professionalism and pedagogic practice of expert teachers, all drawing primarily on data presented in Table 3. This is followed by presentation and discussion of the stratified findings. The agreement count (AC) for high-scoring themes is indicated in brackets.

Descriptive statistics
Table 2 presents the descriptive statistics for the review. The 106 reports span 39 years and involve a total of 1124 expert teachers (hereafter ETs). The majority (78%) were categorised as qualitative (including 45 case studies), and a majority (62%) were conducted at secondary level. Twenty-three of the reports describe studies involving teachers of varied subjects and 10 involve non-subject specialist teachers (generalists) at primary level. The remainder report on teachers of 12 different subjects, with studies of maths teachers (n = 25) being the largest subject-specific group. While there was a majority of studies conducted in the USA (59%), the sample includes studies from a total of 16 national contexts, including in Asia (n = 24), Europe (n = 15) and Australasia (n = 3); no studies from African or South American contexts met inclusion criteria.

Table 2 note. Percentages are rounded to the nearest whole number.
Note 3. The remainder of the reports (n = 9) included both primary and secondary levels.

Table 3 (fragment; column headers and remaining rows are missing from the source text). Columns as recovered: theme | agreement count (AC) | rank | rank, the two rank columns belonging to the stratified comparison. "=" appears to mark a tied rank and "-" no rank.
High awareness of what's happening in class | 12 | 1 | 1
Extensive and automated cognitive processes/heuristics (teaching or planning) | 8 | 2= | 4=
Primary concern with student learning/on-task behaviour | 6 | 4= | 2=
Able to make informed decisions in class | 6 | 4= | 4=
Regularly engages in progressive/experimental problem solving | 6 | - | 2=
Able to predict potential problems | 5 | 2= | 6=
Beliefs (8 out of 32 themes)
Relationships/rapport as important | 9 | 4= | 1=
Treating Ls as individuals with diverse needs & backgrounds | 7 | 2= | 3=
Belief in constructivism (or aspects of, esp. non tabula rasa, Ls construct own knowledge) | 7 | 1 | 6=
A sense of moral duty or mission towards Ls | 6 | 4= | 3=
Engaging Ls as important | 6 | - | 1=
Facilitating development of Ls as human beings/future citizens (social responsibility) | 5 | 2= | 10=
Having high expectations/setting high challenges for Ls | 5 | - | 3=
Accepting primary responsibility for learning | 5 | - | 10=
Personal attributes (5 out of 13 themes)
Passion for profession/work as teacher | 12 | 4= | 1=
Care for/love their learners | 12 | |

Findings on the cognition (knowledge base, cognitive processes and beliefs) of expert teachers
Expert teachers are frequently found to have well-developed knowledge in a number of areas. First among these is their pedagogical content knowledge (PCK) (Shulman, 1987), which, while definitions vary (see Van Driel & Berry, 2010), appears to be a core feature of ET cognition (AC = 17), and links to compelling evidence in the dataset for both extensive knowledge of their subject, and an extensive, integrated knowledge base underpinning both subject knowledge and PCK. ETs are also frequently found to have extensive knowledge about their learners (AC = 16), both in general (characteristics of typical learner types in their context) and with regard to specific individuals (e.g., knowledge of their personal needs and challenges) as well as their curriculum. It is likely that these two factors are key to understanding how and why teacher expertise is often discussed as being context-specific (see, e.g., Berliner, 2001; Hattie, 2003).
Research on expert teachers' cognitive processes indicates strongly that they have a high awareness of what is happening in the classroom (e.g., learner behaviour, progress and need for support) (AC = 12), linked closely to a primary concern with students' being on-task and learning (e.g., Wolff et al., 2015). Likely informed by their extensive knowledge of learners, ETs are often able to predict potential problems and intervene proactively to prevent them. Because they have an extensive range of automated cognitive processes and heuristics (e.g., specific ways to respond to occurrences or manage lesson stages), they are able to deal with the unexpected effectively and make informed decisions as a result. Finally in this area, ETs are often found to engage in what Bereiter and Scardamalia (1993) call "progressive problem solving": learning from their experimentation/improvisation when confronted with the unexpected, consistent with Schön's theory of reflection-in-action (1983).
With regard to expert teachers' beliefs, the most salient of many that are reported relate to issues of interpersonal practice, including beliefs in the importance of building good relationships with their learners (AC = 9), in engaging them throughout the lesson, and in treating each learner as an individual, aware of their diverse needs and backgrounds (again linking back to teacher knowledge of learners) (e.g., Rollett, 2001). Likely underpinning these interpersonal beliefs were two more ideological beliefs: in their sense of moral duty or mission towards their learners (consistent with Korthagen's 'good teacher' framework), and in the need to facilitate the development of learners as human beings and future citizens, for whom they were also often found to accept primary responsibility for learning (e.g., Waynik, 2013). Two more pedagogically-oriented beliefs, also salient in the dataset, were in constructivism as a theory of education (or aspects of it, such as the non-tabula rasa principle, or the need for students to build their own understanding of a subject) (e.g., Traianou, 2006), and evidence of ETs either setting high standards for their learners, or having high expectations of their ability (e.g., Milstein, 2015), both of which also receive support from the teacher effectiveness literature (see, respectively, Staub & Stern, 2002; Wang et al., 2018). A further seven beliefs that narrowly missed out on threshold value (AC = 4) also related to issues of interpersonal practice (e.g., respecting learners, building a strong, stable learning community) and associated values (e.g., avoiding blaming learners for their own shortcomings, and building learner self-esteem), all pointing towards very clear evidence for the interpersonal dimension and a moral imperative at the core of expert teacher belief systems.

Personal attributes of expert teachers
Three personal attributes of expert teachers stood out in the dataset (all AC = 12): their passion for their profession and work as a teacher, a positive sense of self (varied constructs including "self-image", "self-confidence" and "self-efficacy" were combined here), and evidence that ETs care for, or love, their learners (e.g., Hanusova et al., 2013); this latter observation is also frequently reported in the teacher effectiveness literature (e.g., Stronge, 2007). In addition, two professionally related attributes found in the dataset were ETs' strong desire to succeed (reports of "ambition" and "motivation" were subsumed into this category) and their resilience/persistence in the face of challenges (e.g., Campbell, 1991), directly comparable to 'self-regulation' in certain teacher competency frameworks (see Klusmann, 2013); both of these are likely to influence and be influenced by their passion for their work, discussed above. Just below the threshold value (AC = 4) were several related features: passion for the specific subject they teach, enjoyment of the act of teaching and an optimistic world view. These findings on personal attributes are largely consistent with those of Bardach et al.'s (2022) metareview of teacher psychological characteristics, which found teacher self-efficacy, enthusiasm and certain personality factors (e.g. extraversion) to exhibit generally positive associations with both teacher well-being and learner outcomes.

Professionalism of expert teachers
Expert teacher professionalism is characterised by three broad, interconnected and very well supported themes: reflection, learning and collaboration. Concerning the first of these, ETs were frequently found to reflect extensively (AC = 21) and critically (including, for example, their willingness to problematise or question their practice) (AC = 13), with several reports that they also challenged themselves through progressive problem solving (e.g., Tsui, 2003). Regarding expert teacher learning, there was strong evidence in the dataset that they have a continuous desire to improve and learn throughout their careers (AC = 16), often through proactive engagement in their own continuing professional development and in-service qualifications, and often through collaboration with peers (e.g., Patterson, 2014). This interest in collaboration manifested itself both directly and indirectly. Direct manifestation included, for example, their participation in professional learning communities. Indirect manifestation included frequent evidence of ETs helping colleagues (AC = 16) and offering support of a range of types, such as mentoring and informal peer support alongside formalised teacher educator and leadership positions (e.g., Goodwyn, 2011). They were also found regularly to share resources and/or ideas with colleagues. A final theme of ET professionalism noted in the dataset was evidence that ETs are dedicated, hardworking practitioners (e.g., Ortogero et al., 2017), potentially due to the personal characteristics discussed above: a passion for their work and a strong desire to succeed.

Pedagogic practices of expert teachers
By far the largest category in our dataset concerned expert teachers' pedagogic practices, including both pre-active (i.e., planning) and interactive (in class) aspects of pedagogy. Thirty-nine of 89 themes in this domain reached the threshold value.
Evidence on the planning practices of ETs indicates, firstly, that they are careful planners, although they may not necessarily plan in written form (for some, planning may be wholly mental; e.g., Borko & Livingston, 1989), and secondly, that they consider learners' needs when planning (AC = 12), yet without overlooking long-term (e.g., curricular) objectives. However, paradoxically, ETs are also frequently reported as flexible and contingent planners, aware that they may need to change tack in class if required (e.g., Yang, 2014; see below). With regard to materials preparation, two themes emerged: that they regularly adapt core curriculum (e.g., textbook) materials, and that they supplement these with their own materials or activity types.
With regard to the structuring of lessons, one of the most frequently reported themes in the dataset reveals that expert teachers are, despite their careful planning, able to display flexibility during the lesson, improvising and responding appropriately to the learning as it happens (i.e., what is frequently referred to as "adaptive expertise" in the literature; e.g., Crawford, 2007) (AC = 20). This is likely to be facilitated by their regularly reported ability to reflect interactively (i.e., while teaching; Anderson, 2019). Nonetheless, this flexibility does not mean there is an absence of structure in ETs' lessons: there was also strong evidence that ETs have clear routines and procedures (AC = 14), particularly at primary level (e.g., Leinhardt, 1983), and of cohesion between the learning activities used, which were often found to be appropriate to the developmental levels of learners. ETs were also found to exhibit high time-on-task (i.e., time in which learners were engaged in learning activities, as opposed to administrative or behavioural distractions; see Hattie, 2009).
Consistent with their strong beliefs in the importance of interpersonal practices, ETs were found to create positive, supportive learning environments (AC = 12), develop close, meaningful relationships with learners, and cultivate mutual respect and trust within the classroom community (e.g., Bucci, 1999). These interpersonal skills were also evident in behaviour management practices, with ETs showing both sensitivity towards the emotional environment of the classroom and the ability to anticipate and prevent potential disturbances when these occurred (e.g., Wubbels et al., 2006), consistent with cognitive processes described above. Closely linked to their interpersonal practices, and potentially as a result of them, expert teachers were frequently found to engage learners in the learning process, through their choice of specific activities, strategies or practices (AC = 17), and to make learning enjoyable, through their use of humour and intrinsically engaging activities.
With regard to interaction dynamics in the classroom, there was strong evidence that ETs make regular use of collaborative learning (the inclusion of groupwork and pairwork activities) (AC = 12), including, sometimes, the more specific practices of "cooperative learning" (e.g., Bevins, 2002; see Johnson & Johnson, 2009). There was also evidence that this was balanced with teacher-led activities, such as whole-class teaching that involved a wide range of strategies to convey content (further evidence of ETs' well-developed PCK), and with independent (non-collaborative) seatwork (e.g., Conners, 2008). During both seatwork and pair/groupwork, ETs were frequently observed to monitor learning actively, circulating around the classroom and offering individual support, while also assessing progress and keeping learners on task (e.g., Smith & Strahan, 2004). This was prominent among several ways in which ETs were found to individualise learning, enabling them to provide differentiated instruction appropriate to learners' individual needs, interests or challenges.
ETs' regularly-reported belief in constructivism (see above) was also reflected in their classrooms (e.g., Traianou, 2006), with very strong evidence that they frequently link learning to learners' lives and prior schemata (AC = 17), facilitating learning of new content by building on what they already know. They often encourage peer tutoring (e.g., peer-feedback, peer-correction and peer-instruction), and incorporate inductive learning (e.g., problem-based and discovery learning) into their lessons.
While classroom dialogue was not a frequent focus of ET studies, ETs were often found to communicate effectively with learners, engaging in dialogue with them and making use of varied questioning strategies to involve learners, observed often during whole-class teaching (e.g., Anderson, 2021).
ETs develop their learners' cognition and metacognition effectively, with regular evidence of a focus on higher-order thinking skills (including critical thinking and creativity), as well as frequent attempts to scaffold learning (AC = 15) and build learner understanding of content (e.g., Chen & Ding, 2018), rather than simply memorisation of facts. They were also found to develop aspects of learners' metacognition, including appropriate study skills and self-regulation abilities that enabled them to learn more autonomously.
Finally, while summative assessment is rarely a focus of ET studies, there was strong evidence in the dataset that formative assessment is central to ETs' practice (e.g., Lin & Li, 2011), particularly what are sometimes called "dynamic assessment" and "assessment for learning" (see Leung, 2007), that is, a teacher's ability to assess continually as they teach (e.g., while questioning learners during whole-class teaching or while monitoring learner-independent activity work). Other, more specific assessment practices evidenced in the dataset include a focus on assessing prior knowledge before providing new instruction, and the provision of qualitative feedback to learners (i.e., spoken or written guidance, correction and support). These findings are largely consistent with those of Black and Wiliam (1998), who identified both formative assessment and qualitative feedback to be important positive influences on student learning.

Comparing teacher expertise at primary and secondary levels
Comparison of the findings of reports at primary (n = 31) and secondary (n = 66) levels was conducted through analysing rank order differences between the two groups. These differences are also displayed in Table 3 (Pri. Rank and Sec. Rank columns) for themes that achieved the overall AC threshold count of five or more.
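The rank order comparison described above can be illustrated with a minimal sketch. It assumes that themes are ranked within each level by agreement count, that tied themes share a rank, and that only themes ranked at both levels are compared; the function names and the example counts are hypothetical and are not taken from Table 3.

```python
from typing import Dict


def rank_themes(agreement_counts: Dict[str, int]) -> Dict[str, int]:
    """Rank themes by agreement count (1 = highest); tied themes share a rank."""
    ordered = sorted(agreement_counts.items(), key=lambda item: -item[1])
    ranks: Dict[str, int] = {}
    previous_count = None
    previous_rank = 0
    for position, (theme, count) in enumerate(ordered, start=1):
        if count != previous_count:
            # A new (lower) count starts a new rank at the current position.
            previous_rank = position
            previous_count = count
        ranks[theme] = previous_rank
    return ranks


def rank_differences(primary: Dict[str, int], secondary: Dict[str, int]) -> Dict[str, int]:
    """Primary rank minus secondary rank for themes ranked at both levels.

    A positive value means the theme ranks higher (nearer to 1) at secondary level.
    """
    primary_ranks = rank_themes(primary)
    secondary_ranks = rank_themes(secondary)
    shared = primary_ranks.keys() & secondary_ranks.keys()
    return {theme: primary_ranks[theme] - secondary_ranks[theme] for theme in shared}


# Hypothetical agreement counts for illustration only (not the study's data).
primary_counts = {
    "flexibility": 9,
    "collaborative learning": 6,
    "peer tutoring": 5,
    "close relationships": 2,
}
secondary_counts = {
    "flexibility": 14,
    "close relationships": 10,
    "collaborative learning": 8,
    "peer tutoring": 3,
}

differences = rank_differences(primary_counts, secondary_counts)
```

In this invented example, "flexibility" tops both lists (difference of 0), while "close relationships" ranks two places higher at secondary level. Themes with no agreement at one level are left out of the comparison rather than given an arbitrary rank.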
As might be expected, in areas of cognition (knowledge base, cognitive processes and beliefs), few differences were detected when comparing primary and secondary teacher expertise, with near identical rank orders of themes. The exception to this was in the area of ET beliefs, where two beliefs were prominent among secondary ETs for which no agreement was found for studies involving primary ETs: beliefs in the importance of engaging learners and in having high expectations of learners. Conversely, two beliefs ranked higher among primary than secondary teachers: beliefs in constructivism and in their need to develop learners as human beings/future citizens. In the area of personal attributes, small differences were also detected, with two attributes ranking noticeably higher for secondary ETs (their passion for their work and their positive self-image/self-confidence) and one higher for primary ETs (an optimistic/positive world view). In the area of professionalism, very similar rank orders were found, although it is notable that reflection (particularly critical reflection) was more frequently reported in secondary contexts (AC = 16) than primary (AC = 4).
With regard to pedagogy, variation was not as high as might be expected, given important differences at the two levels (OECD, 2018). The ability to display flexibility/improvise while teaching topped the list for both groups (further evidence of its importance), and several others scored very highly across both levels (linking learning to learners' lives/schemata, scaffolding learning effectively, and making regular use of collaborative learning). Further, of the 89 pedagogy-related themes in the dataset, only five that ranked in the top 20 at each level were found not to make the top 25 at the other level. Interestingly, four of the five themes that ranked noticeably higher at secondary level related to interpersonal elements of teaching (developing close meaningful relationships with learners, cultivating mutual respect/trust, showing sensitivity to learners' emotional needs, and making lessons enjoyable for learners; the fifth was interactive reflection). However, those that ranked noticeably higher at primary level varied more, including developing learners' study skills/autonomy/metacognition, adapting core curriculum materials, encouraging peer tutoring and appropriate teacher talk/communication.
This stratified analysis indicates that many key features of expert teacher pedagogic practice hold true across primary and secondary contexts, although interpersonal and affective themes (both in beliefs and practice) seem to be more prominent in the findings of secondary studies. This finding may signal an important difference between secondary ETs and their non-expert peers that may be less evident (and therefore less reported) at primary level, where close, caring relationships may be the norm. Alternatively, it may indicate the heightened importance of good relationships with teenage learners (e.g., gaining respect and trust) as key prerequisites for effective secondary learning. Prior research indicates that, on balance, for various reasons (e.g., the higher number of teachers each learner has), it becomes "more difficult for teachers and students to build positive relationships" in secondary grades (Bru et al., 2010, p. 530; also see, e.g., Roeser et al., 1998). Thus, the ability of secondary ETs to do this may be critical to their success at this level.

Extended discussion of key findings
While necessarily somewhat generic (unsurprising, considering the range of contexts and levels involved), the prototype of expert teacher pedagogic practice that emerges from within the dataset is one of primarily, but not wholly, "learner-centred" instruction, largely consistent with both Schweisfurth's (2013) seven "minimum standards" for learner-centred education (p. 146), and Bremner's (2021) more empirically derived framework for learner-centred education (pp. 166-170). However, the prototype is more extensive than either of these in that it also supports the inclusion of more teacher-led practices, such as interactive whole-class teaching (see Campbell et al., 2004). This evidence for both teacher-led instruction and learner-centred activities is important, particularly for international development initiatives in education that often emphasise the latter while overlooking the former (e.g., UNICEF, 2015), and also offers support for Direct Instruction (an approach that incorporates both types of learning; see Adams & Engelmann, 1996), found in Hattie's (2009, p. 204) meta-analysis to have one of the highest effect sizes (0.59) of those teaching approaches he investigated.
The data discussed above also offers repeated evidence for expert teachers' ability to balance effectively between two apparently contradictory needs in their pedagogic practice: structure and improvisation. To maintain structure, expert teachers plan lessons carefully, ensure core curriculum content is covered and often incorporate regular routines in the classroom. Yet this structure remains amenable to the responsive needs of learners, evidenced both through expert teachers' willingness to adapt and supplement core content when necessary, and their ability to respond flexibly to learning (assessed dynamically) as it happens in the classroom, improvising and differentiating when necessary. This finding offers further support for the importance of Hatano and Inagaki's theory of "adaptive expertise" (1986) in teaching (see Carbonell et al., 2014) and is consistent with Yinger's (1987) construct of "improvisational performance". This key flexibility of expert teachers may depend on their ability to reflect interactively (Anderson, 2019), made possible by the automatization of more routine processes, both of which are also well documented in the findings above.
The findings of this study also offer strong support for the importance of interpersonal practices as key to effective teaching: the ability of expert teachers to build and maintain good relationships with their learners, engage them effectively in class and both prevent and manage off-task behaviour appropriately, particularly evident at secondary level. These skills may correlate with another consistent finding in the dataset, namely the care that expert teachers exhibit towards their learners; this relationship may be usefully examined in future research to assess the extent to which caring personalities among teachers are consistent with effective interpersonal practices and more effective teaching as a result, as suggested by Korthagen (2004).

Critical discussion of methodology and findings
This study has attempted to achieve an ambitious objective: to bring together the findings from a diverse range of studies linked only by an intention to investigate the same phenomenon, teacher expertise. As such, it necessitates a number of critical reflections based on the methodology adopted and how we have adapted it to this use.
Firstly, because we chose to limit this review specifically to studies investigating teacher 'expertise', a large number of papers that use alternative terms to investigate comparable measures of quality (e.g., effective/successful/competent/good/great teachers/teaching) are excluded from the analysis. We feel that this is justified, given that, as we argue above, expertise can be seen to be a broader, more appropriate measure of teacher quality than 'effectiveness' (Bond et al., 2000; Hattie, 2003; Stigler & Miller, 2018). What is more, there are a large number of parallels between our findings and those in the literatures on teacher effectiveness (e.g., Stronge, 2007) and teacher professional competence (e.g. ) that indicate a degree of overlap between these constructs that is worthy of further investigation.
Secondly, as a result of our intention to avoid a priori imposition of a specific theoretical framework on the dataset, we chose to develop themes inductively, only later grouping these into the six domains reported on. It is important for us to acknowledge that these domains constitute a post hoc framework of our own making. While most themes were easily categorizable within this framework, grey areas were also found (e.g., 'reflection' as 'cognition' or 'professional practice'; 'dedication' as 'personal attribute' or 'professional practice'); other categorisations are also possible. Another outcome of our inductive process is that the themes identified sometimes operate at different hierarchical levels. For example, while 'peer tutoring encouraged' constitutes a quite specific observation on how ETs facilitate learner-centred education, 'engages learners through practices/content/activities/strategies' may be seen as a more general observation. If we chose to lump, for example, perceived learner-centred strategies under a single theme, it would obviously rank higher in our findings. While we have attempted to present our findings in a way that is as close as possible to our original analysis in order to maintain transparency and reduce the influence of our own subjective understandings on the dataset, some may see alternative classifications and patterns that are worthy of exploration.
Thirdly, and linked to the previous two points, it should be acknowledged that 'expertise' itself is a social construct, with different composite elements in various understandings of it, which themselves are likely to be culturally and contextually specific (Anderson, in press; Ericsson, 2018; Stigler & Miller, 2018). In this sense, what emerges here is likely to be as much a reflection of the interests of researchers of expertise as it is evidence for the prevalence of the findings reported upon in the cognition and practice of the ETs themselves. For example, differences reported in the stratified findings between primary and secondary levels may, in part, reflect differences in agendas or interests among both academic and practitioner communities for the two stages of education. Likewise, it should be noted that there are strong biases in the contexts of the available studies analysed, involving mainly higher- or upper-middle-income countries or regions (World Bank, 2019), particularly the US, where the majority of studies were conducted. As such, the resulting prototype is primarily indicative of expertise in these more privileged contexts. Only three studies in the sample were conducted in the lower-income contexts typical of the global South (Anderson, 2021; Chantaranima & Yuenyong, 2014; Toraskar, 2015). Further such studies may indicate important differences across this significant socio-economic divide, such as the importance of complex multilingual practices to expert teachers in India (see Anderson, 2022) that may also hold true for other Southern contexts (see Anderson, in press, for detailed discussion of teacher expertise in the global South).

Specific methodological limitations
The challenges involved in conducting this systematic review require us to acknowledge a number of limitations concerning the methodology of this study, as follows:
The agreement count threshold of five that we adopt is, to some extent, arbitrary but also carefully considered. While an agreement count of two or three could result from measurement error (considering our moderate inter-rater reliability score) or the shared findings of several lower quality studies (metasummary uses the quantifying measures adopted here to reduce this danger), the likelihood of these two dangers falls as the agreement count increases. Thus, an agreement count of four is likely to be reliable (hence, our choice to mention a number of findings in our discussion above that meet this threshold) and five much more so.
Due to our choice to adopt this high agreement count threshold, we also caution that non-inclusion in Table 3 does not indicate that a specific practice or teacher attribute is absent from the practices or cognition of expert teachers; there may be many further such themes within the prototype that may reveal themselves in future research. What we present here is only an initial 'skeleton' or 'sketch' of this prototype necessitated by our methodology.
Because reports on different content subjects (maths, sciences, languages) are included in our sample, our study describes general (non-subject-specific) characteristics of ETs. However, this should not be taken to indicate that subject-specific ET characteristics do not exist (they likely do; see evidence of the importance of PCK above, also Popova et al., 2019). Future subject-specific reviews may elucidate such characteristics, although these will likely require an alternative methodology (e.g., metasynthesis) due to the much smaller sample sizes involved.
Finally, the coding list for the ET themes was generated by only two researchers, and primarily by one of these. This may have biased the focus and organisation of the coding themes in the direction of our own interests and personal backgrounds. As Braun and Clarke (2006, p. 96) observe, "the researcher is … active in the research process; themes do not just 'emerge'". As such, other researchers may have identified themes that we have overlooked, or organised them differently.
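The agreement count and threshold discussed in the first limitation above can be sketched as follows. This is a minimal illustration under a stated assumption: that a theme's agreement count is the number of study reports to which both coders independently assigned that theme. The data structures, function names and example codings are hypothetical and do not reproduce the procedure or data actually used.

```python
from collections import Counter
from typing import Dict, Set

# One coder's decisions: study identifier -> set of themes assigned to that study.
Coding = Dict[str, Set[str]]


def agreement_counts(coder_a: Coding, coder_b: Coding) -> Counter:
    """Count, per theme, the studies to which both coders assigned the theme."""
    counts: Counter = Counter()
    for study in coder_a.keys() & coder_b.keys():
        for theme in coder_a[study] & coder_b[study]:
            counts[theme] += 1
    return counts


def salient_themes(counts: Counter, threshold: int = 5) -> Dict[str, int]:
    """Keep only themes whose agreement count meets the threshold."""
    return {theme: n for theme, n in counts.items() if n >= threshold}


# Hypothetical codings for six invented studies (illustration only).
coder_a: Coding = {f"study_{i}": {"reflection"} for i in range(6)}
coder_a["study_0"].add("humour")
coder_a["study_1"].add("humour")

coder_b: Coding = {f"study_{i}": {"reflection"} for i in range(5)}
coder_b["study_0"].add("humour")
coder_b["study_5"] = {"humour"}

counts = agreement_counts(coder_a, coder_b)
reported = salient_themes(counts, threshold=5)
```

Here both coders agree on "reflection" for five studies, so it meets the threshold, whereas "humour" is agreed on for only one study and is excluded even though each coder assigned it more than once. This mirrors the caution above that exclusion by the threshold does not show a theme to be absent from expert teachers' practice.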

Conclusion
This study has attempted to synthesise and summarise the findings of 106 empirical teacher expertise study-reports in a way that is both systematic and replicable (Maeda et al., 2022;Page et al., 2021). It presents a large number of findings that are of potential interest to those working in primary and secondary education around the world, particularly the strong evidence uncovered for the importance of PCK, reflection, improvisational performance and strong interpersonal relationships with learners among expert teachers.
We believe that the findings presented here are sufficiently well supported in this diverse literature and (invariably) well evidenced across national contexts to be worthy components of Sternberg and Horvath's expert teacher prototype (1995), and see this as the primary contribution of our study. Once more, we repeat their concern (p. 9) that this prototype should not be seen as a list of necessary and sufficient criteria for expertise, but only as a list of frequently shared features: the "family resemblances" of expert teachers. It is important also to note that we have not attempted to isolate these features as unique to expert teachers; some will also be shared by their (non-expert) colleagues. Further, because the research evidence itself indicates that expert teachers are not all alike (Anderson, in press; Sternberg & Horvath, 1995), none are likely to exhibit all the features presented.
Our study also offers a potentially useful modification of Sandelowski and Barroso's metasummary (2007), enabling this versatile methodology to be used for the review of highly complex and only loosely associated literatures (the "fuzzy core" of expertise as a construct; Anderson, 2021) with the addition of two novel criteria (independent agreement and threshold) as a potentially useful alternative to effect size for identifying salience in reviews of what are often highly diverse studies underpinned by very different ontologies and paradigmatic positions.
Providing the critical discussion offered above is kept in mind, we believe that this study can serve as a useful starting point, both for future teacher expertise research and the use of metasummary in other fields of educational research, where it may help to bridge the paradigm gap between qualitative and quantitative methodologies and discourse communities. We also believe that our findings constitute the most detailed expert teacher prototype ever assembled, and while this is still relatively vague, it may serve as a useful resource for a number of areas of education. Firstly, the 73 themes presented in Table 3 describe aspects of the expert teacher that may usefully constitute foci for teacher education, both through investigation in pre-service programmes and through promotion and exploration (e.g., through action research and lesson study) via in-service professional development. Secondly, these same themes may serve as a useful tool for teacher self-evaluation and critical reflection to assist in the identification of areas of potential focus for their own continuous professional development. Related to this, the themes may also prove useful for teacher quality assessment, not as a checklist of dos and don'ts but as a means to cross-evaluate and adapt existing instruments, providing the themes are deemed culturally and contextually appropriate. Finally, a number of the findings presented here may also serve as a basis for future research, particularly that which is able to investigate potentially causative relationships between features of the expert teacher prototype. Examples include the extent to which adaptive expertise is dependent on experience (see Riel & Rowell, 2017), the relationship between interpersonal practices and a teacher's proclivity to care for their learners, and how aspects of learner-centred and teacher-led classroom practice are balanced in the classes of expert teachers in specific contexts worldwide.
Studies investigating these and other themes discussed here are likely to further 'flesh out' the expert teacher prototype to enable us to finally identify and value what makes expert teachers effective.