Opening up the black box: Teacher competence, instructional quality, and students' learning progress

Existing research indicates inconsistent or at best weak predictive effects of teacher knowledge on student achievement. Data from Germany were used to examine the relation between teachers' content and pedagogical content knowledge, their perception, interpretation, and decision-making skills, the instructional quality implemented in class, and students' learning progression in mathematics. Rather than direct effects of teacher knowledge on students, we hypothesized an effect chain with multiple mediation processes while controlling for school type and student background. Multi-level modeling with 3496 students from 154 classrooms revealed a mediating role of teachers' skills and their instructional quality for the relation between teacher knowledge and students' learning progress. Effect sizes were medium to strong, and the model explained a large amount of variance. No direct effects of teachers' knowledge on student progress were found. We discuss our findings with respect to the teacher-competence-as-a-continuum model and with respect to future research.


Introduction
The role of teacher competence in students' learning progression has become a key topic in educational research. This is particularly the case in the field of mathematics education (Kaiser & König, 2020). The quality of corresponding studies has increased substantially over the past years, yielding results that are more reliable and valid. Studies have progressed beyond self-reports of teachers' competence by including standardized tests of different competence facets and relating these to student achievement (e.g., Baumert et al., 2010; Hill et al., 2005; Kersting et al., 2012). Other studies have examined the role that teacher competence plays in the quality of instruction delivered in the classroom (e.g., Charalambous & Praetorius, 2018; Jentsch et al., 2021) and progressed beyond cross-sectional designs by implementing a longitudinal component at the student level to assess the relation between learning progression and selected facets of teacher competence (e.g., Kunter et al., 2013).
However, this research has not yet managed to establish a robust link between teacher competence and student progress. Effect sizes varied between weakly positive and weakly negative estimates (Blömeke & Olsen, 2019; Seidel & Shavelson, 2007), with most studies revealing no effects at all, particularly with respect to the relation of teachers' content knowledge as one competence facet to student outcomes. A major limitation of this research was that only a few studies have examined the full effect chain, including a broad range of teacher competence facets, instructional quality, and students' learning progress, within one study. Most studies were either limited in scope with respect to the range of competence facets assessed, treated the relation between competence and student achievement as a "black box", as Baumert et al. (2010) described it, by omitting instructional quality (see also Hill et al., 2005), or implemented a cross-sectional design (e.g., Blömeke & Olsen, 2019).
Against this background, the purpose of the present paper is to examine the relation between teacher competence, instructional quality, and students' learning progression more comprehensively. This applies, first, to the range of cognitive competence facets included and, second, to the mediating processes which may transform teacher competence into student progress.

Teacher competence
During the last two decades, scholars have established elaborate models of teacher competence that conceptualize a broad range of cognitive and affective-motivational characteristics teachers need to successfully perform their work (e.g., Baumert & Kunter, 2006; for an overview see . To promote the development of domain-specific student cognitions, these models typically stress the relevance of domain-specific cognitive competence facets, more precisely of teachers' domain-specific knowledge on the one hand and their domain-specific cognitive skills to perceive classroom situations, to interpret these, and to make decisions on the other hand. These two types of cognitive competence facets are therefore the focus of the present study. Affective-motivational teacher characteristics are also regarded as important for domain-specific student outcomes but are not at the forefront of this study. Based on Shulman's seminal work (1986), teachers' domain-specific knowledge can be divided into content knowledge and pedagogical content knowledge. In the case of mathematics teachers, the former denotes mathematical content knowledge (MCK) and the latter denotes mathematics pedagogical content knowledge (MPCK). MCK includes knowledge about content domains, such as number, algebra, geometry, and data, which should provide teachers with the necessary background knowledge for teaching. MPCK covers curricular knowledge and planning for mathematics teaching and enacting, which should provide teachers with the necessary knowledge for in-class lesson activities (e.g., Tatto et al., 2008; Kunter et al., 2013). Empirical studies have demonstrated that these facets constitute two distinct knowledge dimensions (e.g., Krauss et al., 2008; Baumert et al., 2010; Blömeke et al., 2016).
Teachers' domain-specific cognitive skills are organized in line with classroom situations, with particular emphasis on situations that are decisive for students' domain-specific learning progression, more precisely with emphasis on those characteristics important for high quality in these situations, such as their instructional design, the potential for students' cognitive activation, individual learning support, and classroom management. A crucial distinction between teachers' knowledge and skills is their proximity to observable behavior in class. Whereas knowledge comprises generalized cognition not necessarily related to one specific classroom situation, cognitive skills are typically organized in a context-related way (Putnam & Borko, 2000). It is important to consider that these skills are conceptualized as cognitive teacher characteristics and are not equal to observed behavior, which is discussed in the next section, "Instructional quality". Cognitively, teachers need to perceive a specific classroom situation as relevant and interpret its different aspects to be able to determine how to act, for example, by anticipating potential student responses or developing alternative instructional strategies (cf. Jacobs et al., 2010). Studies indicate that teachers' cognitive skills can be described as two-dimensional, with one generic and one domain-specific facet (Blömeke et al., 2016). The latter facet is the focus of the present study; its three components (perception, interpretation, and decision-making; PID) have been identified as being interrelated due to their process-oriented character, to the extent that they are difficult to disentangle (Santagata & Yeh, 2016; Stahnke et al., 2016).

Instructional quality
Instructional quality reflects teachers' observable classroom behavior and can be defined by three dimensions conceptualized to some extent independently of a specific teaching domain, namely classroom management, cognitive activation, and student support (Praetorius et al., 2018), and by domain-specific quality characteristics which are, in the case of our study, related to mathematics education (Schlesinger et al., 2018). In a literature review, Charalambous and Praetorius (2018) compiled evidence that predicting students' learning outcomes in mathematics based on domain-specific and generic characteristics of instructional quality may be more successful than predicting them based solely on generic characteristics. Empirical studies applying different observation instruments support this comprehensive approach to modeling instructional quality (Kane & Staiger, 2012). Depending on how these characteristics are conceptualized in the literature and on the breadth of characteristics included, this domain-specific quality can be modelled as consisting of one or several dimensions (Kyriakides et al., 2013; Seidel & Shavelson, 2007).
Models of instructional quality typically conceptualize classroom management as the efficient use of allocated classroom time, the prevention of disorder in the classroom, and the clarity of organizational rules. Diagnosis of students' learning, provision of opportunities for individualization and differentiation, and the creation of a good teaching climate are subsumed under the notion of student support (Fauth et al., 2014). Cognitive activation refers to whether students are challenged by higher-order thinking through teachers' instructional strategies and the learning tasks selected (Lipowsky et al., 2009). In our study, educational quality in mathematics is characterized by an appropriate presentation of the content and content-specific interaction with students, including the provision of domain-specific feedback (Schlesinger & Jentsch, 2016; Jentsch, Schlesinger, Heinrichs, Kaiser, König, & Blömeke, 2021).

Students' learning progression
The framework underlying national and state-wide tests of student achievement in school mathematics in Germany encompasses the so-called "big ideas" that form a three-dimensional model (Blum et al., 2006). The content assessed comprises numbers, measurement, space and form, functional relations, and data and chance. The processes assessed comprise argumentation, problem solving, modeling, representations, using symbolic, formal, and technical elements, and communication. These dimensions are organized along three levels of achievement: reproduction, making connections, and generalization and reflection.

Relations between teacher competence, instructional quality, and learning progression
Given the weak correlations identified hitherto between teacher knowledge and student achievement, Blömeke et al. (2015) proposed a more elaborate model hypothesizing potential mediating processes by drawing on insights from cognitive psychology. Rather than treating the relation between teacher knowledge and student achievement as a "black box", they considered it crucial to evaluate how distal or proximal teacher and teaching characteristics were to a desired outcome.
The model by Blömeke et al. (2015) conceptualized teachers' knowledge facets as traits that are relatively stable across different situations, representing the potential that a teacher brings to the classroom. This part of the model is similar to the teacher knowledge models developed, for example, by Ball et al. (2008) or Baumert and Kunter (2006).
The model by Blömeke et al. (2015) also considers Schön's (1983) concepts of reflection in and on action by including cognitive processes immediately prior to, during, and following typical classroom situations (Star & Strickland, 2008). Teachers' skills to perceive, interpret and make decisions, that is their PID skills, were thus conceptualized as a context-related and situated facet of their cognitions, equal neither to more generalizable knowledge nor to observable teaching behavior in specific classroom situations. This model thereby merged the previously discrete understandings of teacher competence from a dispositional or a situated perspective (Rowland & Ruthven, 2011). PID skills were thus conceptualized as mediators between teacher knowledge and teaching behavior in the classroom and functionally related to both. Likewise, Meschede et al. (2017, p. 159) argued that teachers' skills were an "essential mediator" between teachers' knowledge and their teaching.
Observable teaching behavior was conceptualized as being most closely related to student outcomes. Summing up, Blömeke et al. (2015) suggested considering an effect chain that acknowledges teachers' PID skills as an important mediator in the transformation of teacher knowledge into teaching behavior and this teaching behavior as another mediator into students' learning progression.
Whereas this theory argues well for a potentially causal chain (knowledge, skills, behavior, and outcomes), some parts of the model are less elaborate. This applies first to the relationship between different knowledge dimensions and teachers' PID skills, and second to direct versus indirect effects. Blömeke et al. (2015) leave it open how exactly MCK and MPCK are related to PID skills. MCK is less clearly related to mathematics teaching than MPCK, as MCK includes general mathematical knowledge that serves as foundational background knowledge, whereas MPCK is related to teaching tasks (for examples see sections 3.1 and 3.2 in the electronic supplement). The legacy of research on cognitive abilities points in such cases to placements at different hierarchical levels. More general abilities are assumed to underlie more specific ones (Jensen, 1998). Reasoning in line with such a model would mean that MPCK is hypothesized to have direct effects on teachers' PID skills, while MCK is hypothesized to be predictive of MPCK (cf. similar discussions by Ball et al., 2008; Baumert et al., 2010).
In addition, the description by Blömeke et al. (2015) suggests that teacher knowledge affects instructional quality and students' learning progression only indirectly. However, instead of hypothesizing that effects of MCK and MPCK on teaching behavior or student outcomes are fully mediated by teachers' PID skills, which is a very strict requirement, direct effects would, at least to some extent, be plausible as well. With respect to the relation between teacher knowledge and instructional quality, such direct effects may reflect a richer domain-specific terminology, for example.

Relation of MCK and MPCK to teachers' PID skills
As noted, studies intending to establish direct relations between mathematics teachers' content knowledge and students' learning progression without considering potentially mediating characteristics of the hypothesized effect chain knowledge-skills-teaching behavior-outcomes typically found no or weak predictive effects (e.g., Hanushek et al., 2005;Hill et al., 2007). Direct effects of MPCK on student achievement have rarely been examined: we identified one study where a significant medium (according to Cohen, 1988; see this reference also in the following for categorization of effect sizes) effect was found for German lower-secondary mathematics teachers (Baumert et al., 2010).
More studies have examined small parts of the hypothesized effect chain, identifying systematic relations more successfully. The examination of the relation between MCK and MPCK has predominantly revealed strong correlations (e.g., Tatto et al., 2012). Regarding their relation to teachers' PID skills and potential causality, Dunekacke et al. (2015) demonstrated in the context of German early childhood education and care that MCK was a necessary precondition for MPCK. Baumert et al.'s (2010) findings with respect to lower-secondary mathematics teachers can be interpreted similarly.
In both projects, teachers' PID skills were also assessed (though differently operationalized). The data revealed that both teachers' MCK and MPCK were significantly positively related to these skills, with a stronger effect size for MPCK than for MCK (Blömeke et al., 2016; Bruckmaier et al., 2016). A significant relation between both MCK and MPCK and skills (focused on mathematical representations) was also identified in Dreher and Kuntze's (2015) study among German lower-secondary teachers. However, the effects were inconsistent for different groups of teachers, with MCK showing significant effects for student teachers but MPCK for practicing teachers. US studies of mathematics teachers revealed significant relations between different types of their content knowledge and their skills but did not include pedagogical content knowledge (Kersting et al., 2012; Hill & Chin, 2018). Overall, existing research indicates a positive relation between MCK and MPCK and teachers' PID skills, but the exact nature of their interplay remains unclear.

Relation between teacher competence and instructional quality
The relation between the different facets of mathematics teachers' competence and the instructional quality implemented in class has also been examined in a range of studies, albeit mostly limited to one or two competence facets and some facets of instructional quality. Baumert et al. (2010) found a significant relation of medium effect size between MCK and a domain-specific dimension of instructional quality (curricular alignment) but no relation between MCK and the three generic dimensions. MPCK was related to cognitive activation, also with a medium effect size, but not to curricular alignment, student support, or classroom management (see also Kunter et al., 2013). Hill et al. (2007) provided evidence with respect to primary education and the domain-specific characteristics of instructional quality as follows: Teachers with lower content knowledge made more mathematical errors in their instruction, while teachers with higher content knowledge used richer representations, explanations, and justifications. Effect sizes were medium to large. Kelcey et al. (2019) replicated these findings. Bruckmaier et al. (2016) also demonstrated that teachers' skills were related to instructional quality. Kersting et al. (2012) conducted a study that included both a paper-pencil test of mathematics teachers' knowledge and an assessment of teachers' cognitive skills where they had to answer questions related to video-cued classroom situations, both focused on fractions. They found no direct effects of the results of the paper-pencil test on instructional quality but found a relation of medium effect size between the two instruments and an effect of large size of the video-based measure on instructional quality when both measures were included. This may indicate that only teachers' skills are related to instructional quality, while MCK may have an indirect effect on it via teachers' skills.
Overall, it seems plausible to infer an effect of teachers' knowledge and skills on instructional quality from such findings, but the precise interplay remains unclear.

Relation between instructional quality and students' learning progression
Finally, studies have examined the relation between instructional quality and students' learning progression. Data from the German-Swiss "Pythagoras" study (Lipowsky et al., 2009) revealed positive effects for both classroom management and cognitive activation of small size. Hill et al. (2007), Kersting et al. (2012), and Kelcey et al. (2019) showed that higher quality instruction positively affected student learning gains and mediated the role of teachers' content knowledge. Effect sizes varied in these studies. In the study by Kunter et al. (2013) high levels of cognitive activation and efficient classroom management were also found to mediate the effect of MPCK and to predict greater achievement gains. Overall, the state of research is consistent in treating instructional quality as a mediator of knowledge effects on student progression. In particular, cognitive activation and effective classroom management are those basic dimensions of instructional quality that contribute to learning gains among students.

Potentially confounding variables
To estimate teacher effects on student outcomes correctly, it is important to control for potentially confounding variables that would otherwise create bias in the results due to non-random student-teacher assignments (Koedel et al., 2015). Reviews of studies in the context of educational effectiveness research and value-added modeling point to prior achievement, student background, and school context as core variables that were significantly related to outcomes, mostly with large effect sizes. They should therefore be included in studies to ensure fair comparisons (Levy et al., 2019). Furthermore, most studies mentioned above revealed confounding effects of student background and, particularly in Germany, school type (e.g., Baumert et al., 2010), with large effect sizes.

Research questions and hypotheses
The state of research allows for deriving directional hypotheses, in a few cases also including conjectures on the degree of the effect size based on the reliability and the expected proximity of the constructs assessed compared to other studies. However, in most cases it is impossible to infer the potential effect size from the literature due to greatly varying study designs and results. Based on the research described above, the relations between teacher competence, instructional quality, and students' learning progression are hypothesized as follows (see Fig. 1):
1) Teachers' MCK has a positive predictive effect on MPCK (H1a). MPCK has a positive predictive effect on teachers' PID skills related to mathematics instruction (H1b). MPCK is expected to at least partly mediate the effect of MCK on these skills (H1c). MCK may or may not have an additional direct effect on teachers' skills.
2) It is then hypothesized that mathematics teachers' PID skills predict students' learning progression in mathematics (H2a) and that the effects of MCK and MPCK are at least partly mediated by these skills (H2b). MCK and MPCK may or may not additionally directly affect student progress.
3) Having clarified the relation of MCK and MPCK to teachers' PID skills and students' progress, teacher competence is hypothesized to be positively related to instructional quality (H3a), which in turn significantly positively affects students' learning progression in mathematics (H3b). Furthermore, instructional quality is hypothesized to play a mediating role regarding the effects of teacher competence on student progress (H3c). Teachers' PID skills, MCK, or MPCK may or may not have direct additional effects on student progress.

Methodology
In the following, the context of the study, the sample, psychometric properties of all instruments including item examples, and the data analysis including results of the different robustness checks are described briefly. For details see the electronic supplement.

Context of the present study: TEDS-Instruct and TEDS-Validate
The data stem from two studies, TEDS-Instruct and TEDS-Validate, conducted in Germany within a research program departing from the international "Teacher Education and Development Study in Mathematics" (TEDS-M), in which MCK and MPCK tests were developed (Tatto et al., 2008). In a German follow-up study (TEDS-FU; Kaiser et al., 2015), Blömeke et al.'s (2015) broader teacher competence framework was applied, and tests of mathematics teachers' PID skills were developed. TEDS-Instruct, carried out in the federal state of Hamburg, and TEDS-Validate, carried out in the federal state of Thuringia, added tests of students' learning progression (Kaiser & König, 2020). Both states have two types of middle school: an academic track called Gymnasium and a non-academic track called Stadtteilschule (Hamburg), Regelschule, or Gesamtschule (Thuringia).

Sample
Our sample comprised 3496 students from 154 classrooms. Of the students, 51.2% were female, 15.8% did not speak German as their first language, and 4.8% had special needs. Average class size was 22.7 students. In the federal state of Hamburg, the classes sampled were tested at the beginning of grade 7 in 2012, 2013, and 2014 and followed up about 1.5 years later at the end of grade 8. In the federal state of Thuringia, the classes sampled were tested at the end of grade 6 in 2012, 2013, 2014, 2015, and 2016 and followed up about two years later at the end of grade 8. The 154 classes were taught by 89 teachers. About half of the teachers were female, and 63.2% were teaching at a Gymnasium. On average, they were 40 years old and had been teaching for 13 years.
Teachers' MCK, MPCK, and PID skills were assessed simultaneously during one computer-based session that could be paused once. Observations of teachers' instructional quality were available for 49 teachers because this element had not been included in all studies. Since the vast majority of these teachers (n = 37, teaching 879 students) were from Hamburg, we restricted the models including instructional quality to Hamburg to avoid an unbalanced sample. Teachers volunteered to participate in the study. Therefore, we must assume self-selection bias.

Teacher competence
Mathematics teachers' competence was assessed with digitalized tests that covered MCK, MPCK, and PID skills (Kaiser et al., 2015). Scaled scores were created by applying item-response theory as implemented in the software package Conquest (Wu et al., 1997). Items omitted or not reached were considered as incorrect responses. MCK was assessed internet-based with an abbreviated version of the original paper-and-pencil TEDS-M test (Tatto et al., 2012), validated in several studies (see Blömeke et al., 2016 for an overview). The 27 items included numbers, algebra, data, and geometry as core areas of school mathematics. Reliability both in terms of Cronbach's α and Warm's weighted likelihood estimate (WLE) was 0.81. MPCK was assessed with the original TEDS-M test (Tatto et al., 2012). The 28 items covered curricular and planning knowledge as well as knowledge about how to teach mathematics. Reliability was α = 0.79 or WLE = 0.78.
Teachers' PID skills were assessed video-based with 32 items requiring them to perceive, interpret, and make decisions with respect to typical classroom situations presented in three scripted video clips (Kaiser et al., 2015). The clips served as cues and lasted between 2.5 and 4 min. The scale's reliability was α = 0.72 or WLE = 0.73.
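The reliability coefficients reported above can be illustrated with a minimal sketch of Cronbach's α for dichotomously scored items. This is not the scaling procedure used in the study, which relied on item-response theory; the function names and the toy response matrix are hypothetical. The only element taken from the text is the rule that omitted or not-reached items count as incorrect.

```python
from statistics import variance


def score_responses(raw_rows):
    """Score omitted or not-reached items (None) as incorrect (0), as described in the text."""
    return [[0 if r is None else r for r in row] for row in raw_rows]


def cronbach_alpha(rows):
    """Cronbach's alpha for a persons-by-items matrix of item scores."""
    n_items = len(rows[0])
    item_columns = list(zip(*rows))
    sum_item_variances = sum(variance(col) for col in item_columns)
    total_score_variance = variance([sum(row) for row in rows])
    return n_items / (n_items - 1) * (1 - sum_item_variances / total_score_variance)


# Hypothetical toy data: four teachers, three items, one omitted response.
toy = score_responses([[1, 1, None], [0, 0, 0], [1, 1, 1], [0, 1, 0]])
print(round(cronbach_alpha(toy), 2))
```

With real test data, the matrix would have one row per teacher and one column per item (27 for MCK, 28 for MPCK, 32 for PID skills).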

Instructional quality
Instructional quality was assessed using a standardized observation protocol (Schlesinger et al., 2018). The 21 items covered classroom management (4 items), student support (5), cognitive activation (6), and mathematics educational structuring (6) as indicators of one latent variable to reduce complexity, given that we examined the entire effect chain from teacher knowledge to student achievement, and to avoid multicollinearity. Confirmatory factor analysis revealed a good fit of the data to such a model (CFI = 0.96, RMSEA = 0.02, SRMR (between) = 0.04, χ² = 2.6, df = 2, p = .27).

Student achievement in mathematics
Student achievement at the first measurement point was assessed by state-wide tests developed in the federal states of Hamburg and Thuringia based on the German national standards. At the second measurement point, student achievement was assessed by a national test based on these nation-wide standards. The data in terms of person parameters (WLE) resulting from item-response-theory scaling were provided by the respective units responsible for testing in the two federal states. Since the data came from different cohorts, we refrain from comparing absolute values across groups. Since the period between pre- and post-tests also varied (1.5-2 years), we also refrain from reporting students' absolute learning gains.

Control variables
As an indicator of students' average educational background, we used their language background. Since this information was not available at the individual level, we included the proportion of students with German as their first language at class level, being aware that the meaning of this variable may differ as a function of aggregation (see the Limitations section). Given the differences in learning opportunities between the Gymnasium and the non-academic track in Germany, the type of school attended was included as a school context variable.

Data analysis
We applied a series of two-level random-intercept mediation models with students on the first and classes on the second level to test our hypotheses. To account for the difference in the number of classes taught by one teacher, we implemented a teacher weight that was inversely proportional to the number of classes taught. Variables were group-mean centered at the within level and remained uncentered at the between level. Missing data were handled using the full information maximum likelihood procedure. To examine the robustness of our model, we applied several alternative approaches. The differences were negligible. All analyses were conducted using the statistical software package Mplus version 8 (Muthén & Muthén, 1998). Direct and indirect effects were estimated. Based on Cohen (1988), we interpreted coefficients around 0.10 as weak, around 0.30 as moderate, and around 0.50 or larger as strong direct effects. In its squared version, a small indirect effect size is around 0.01, medium 0.09, and large 0.25.
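Three of the steps described in this section (the teacher weight, group-mean centering, and the verbal labeling of coefficients) can be sketched as follows. This is a minimal illustration, not the Mplus specification used in the study. The function names, the proportionality constant of 1 for the weight, and the nearest-benchmark rule for turning "around 0.10/0.30/0.50" into labels are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def teacher_weights(class_to_teacher):
    """Weight each class by 1 / (number of classes taught by its teacher)."""
    classes_per_teacher = Counter(class_to_teacher.values())
    return {cls: 1.0 / classes_per_teacher[t] for cls, t in class_to_teacher.items()}


def group_mean_center(values, groups):
    """Subtract each group's mean from the values belonging to that group."""
    totals, counts = defaultdict(float), defaultdict(int)
    for v, g in zip(values, groups):
        totals[g] += v
        counts[g] += 1
    return [v - totals[g] / counts[g] for v, g in zip(values, groups)]


# Benchmarks for direct effects as stated in the text (Cohen, 1988).
BENCHMARKS = {"weak": 0.10, "moderate": 0.30, "strong": 0.50}


def effect_label(beta):
    """Assumed rule: label |beta| by the nearest benchmark; 0.50 or larger is strong."""
    size = abs(beta)
    if size >= BENCHMARKS["strong"]:
        return "strong"
    return min(BENCHMARKS, key=lambda name: abs(size - BENCHMARKS[name]))


# Hypothetical example: teacher T1 teaches two classes, teacher T2 one class.
print(teacher_weights({"7a": "T1", "7b": "T1", "8a": "T2"}))
print(group_mean_center([1.0, 3.0, 10.0], ["7a", "7a", "8a"]))
print(effect_label(0.12), effect_label(0.46), effect_label(0.69))
```

Under this rule, a coefficient of 0.46 is labeled strong and one of 0.12 weak, which agrees with how coefficients of these sizes are described in the results.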

Descriptive results and direct effects of teacher knowledge
The intra-class correlation of students' mathematical achievement was 0.54, indicating large differences between classrooms. The predictive effect of achievement at the first time point for results at the second time point was high (β = 0.88) on the between-level, and results from the pre-test explained more than three-quarters of the variance in the post-test results (see Model 1 in Table 1).
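The two quantities in this paragraph can be related with a short sketch. The intra-class correlation is the share of total variance located between classrooms, and with a single standardized predictor the share of variance explained equals the squared coefficient. The variance components below are hypothetical values chosen only to reproduce the reported ratio; the study does not report them here, and treating the between-level model as a single-predictor regression is an assumption.

```python
def intraclass_correlation(var_between, var_within):
    """Share of total variance that lies between groups (here: classrooms)."""
    return var_between / (var_between + var_within)


def variance_explained(beta):
    """R^2 for one standardized predictor: the squared standardized coefficient."""
    return beta ** 2


# Hypothetical variance components that yield the reported ICC of 0.54.
print(round(intraclass_correlation(0.54, 0.46), 2))

# Reported between-level coefficient of prior achievement (beta = 0.88).
print(round(variance_explained(0.88), 2))
```

The squared coefficient is about 0.77, in line with the statement that the pre-test explained more than three-quarters of the variance in the post-test at the between level.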
School type was strongly related to students' learning progression in mathematics (β = 0.46), while the effect of prior achievement decreased substantially (β = 0.52; see Model 2 in Table 1). Student background, which was strongly correlated with school type (see Table 2 in the electronic supplement), had no separate predictive effect on learning progress (β = -0.00). As expected, no significant direct effect of teachers' MCK (β = 0.03) or MPCK (β = -0.05) on students' learning progress was observed (see Model 3 in Table 1, "black-box" model).

Relation of teacher competence to students' learning progression (H1, H2)
We next tested the relation between teacher competence and students' learning progression in mathematics. Teachers' PID skills were hypothesized to mediate at least partly the effects of teacher knowledge on student progress (H2), while MCK was hypothesized to be a predictor of MPCK, which in turn should predict teachers' PID skills (H1, see Fig. 1). Since student background did not contribute beyond school type, we omitted this control variable in favor of more parsimonious models given the small sample size.
Model 4 (see Table 2) reflected MCK's effect on teachers' PID skills in perceiving and interpreting classroom events and deciding how to proceed mediated by their MPCK (H1c). This model was supported by the data. MCK had a strong predictive effect on MPCK (H1a, β = 0.69), which in turn had a strong effect on PID skills (H1b, β = 0.50).
Teachers' PID skills had a significant but weak predictive effect on students' learning progression in mathematics (β = 0.12). They fully mediated the effects of MPCK (β = 0.06) on student progress. This means that there were no direct effects of MPCK on student progress. The same applied to MCK. With respect to this knowledge dimension, MPCK and teachers' PID skills combined fully mediated its effect on learning progression (β = 0.04). Mediation through MPCK alone (-0.07) or skills alone (0.01) was not significant. Since the direct effects of teacher knowledge on student progress were not significant (MCK = 0.05, MPCK = -0.10), the total effects of MCK (0.03 in Model 4) and MPCK (-0.05) were small and not significant. Similar to the basic models, school type was an important variable that needed to be controlled for (0.45).
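The indirect effects reported for Model 4 are consistent with the product-of-coefficients logic of mediation analysis, as the following sketch shows. It uses only the rounded standardized coefficients given in the text, so small rounding differences from the estimates in Table 2 are to be expected; the function and variable names are hypothetical.

```python
def indirect_effect(*paths):
    """Indirect effect as the product of the coefficients along one path."""
    result = 1.0
    for coefficient in paths:
        result *= coefficient
    return result


# Rounded standardized coefficients reported for Model 4.
MCK_TO_MPCK = 0.69
MPCK_TO_PID = 0.50
PID_TO_PROGRESS = 0.12

# MPCK -> PID skills -> progress
mpck_via_pid = indirect_effect(MPCK_TO_PID, PID_TO_PROGRESS)

# MCK -> MPCK -> PID skills -> progress
mck_via_chain = indirect_effect(MCK_TO_MPCK, MPCK_TO_PID, PID_TO_PROGRESS)

# Total effect of MCK: direct effect plus the three reported indirect effects
# (via MPCK only, via skills only, via MPCK and skills combined).
mck_total = 0.05 + (-0.07) + 0.01 + 0.04

print(round(mpck_via_pid, 2), round(mck_via_chain, 2), round(mck_total, 2))
```

The products reproduce the reported indirect effects of 0.06 for MPCK and 0.04 for MCK, and the sum of the reported direct and indirect effects of MCK matches its total effect of 0.03.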

Testing the full effect chain with instructional quality as another mediator (H3)
Finally, we included instructional quality as another mediator of teacher competence's effect on students' learning progression in mathematics. This analysis was, as already mentioned, restricted to the sample from the federal state of Hamburg. Given the small number of classrooms, we proceeded with a parsimonious model in which (the previously insignificant) direct effects of MCK on teachers' PID skills or student progress or of MPCK on student progress were no longer modelled. Given the inconclusiveness regarding the role of teachers' skills, we tested competing models, one with indirect effects only regarding this competence facet (see Model 5 in Table 3) and one with an additional direct effect on students' learning progress (Model 6 in Table 3). Since school type was not significantly correlated with instructional quality (see electronic supplement, Table 2), we estimated both models twice, once without (Models 5a and 6a in Table 3) and once including the control variable (Models 5b and 6b in Table 3; for the fully saturated model see Model 7 in supplement Table 5 in the electronic supplement; differences in the results are negligible.) Similar to the models reported above, teachers' MCK significantly and with a large effect size predicted their MPCK in all models (0.81). MPCK in turn significantly and with a large effect size predicted teachers' PID skills (0.78). Furthermore, as hypothesized (H3a), teachers' PID skills significantly and with a large effect size predicted instructional quality in all models (0.53-0.58). The remaining results varied depending on the model. Model 5a (see Table 3) reflected a model with indirect effects of teacher competence on student progress without controlling for school type. As hypothesized (H3b), instructional quality significantly predicted students' learning progression in this model (0.18). 
The indirect effects of all three facets of teacher competence were also significant and of medium effect size (PID skills: 0.10; MPCK: 0.08; MCK: 0.07).
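In a serial mediation chain, each standardized indirect effect equals the product of the path coefficients along the chain. The sketch below retraces the three indirect effects of Model 5a from the rounded coefficients quoted in the text. It assumes the upper end (0.58) of the reported 0.53-0.58 range for the path from PID skills to instructional quality, because the text does not state the exact value for this model; the path labels are illustrative.

```python
from math import prod

# Rounded standardized path coefficients quoted in the text for Model 5a.
paths = {
    "MCK->MPCK": 0.81,
    "MPCK->PID": 0.78,
    "PID->IQ": 0.58,  # ASSUMPTION: upper end of the reported 0.53-0.58 range
    "IQ->progress": 0.18,
}

# Order of the hypothesized effect chain (IQ = instructional quality).
chain = ["MCK->MPCK", "MPCK->PID", "PID->IQ", "IQ->progress"]


def indirect_effect(start_index):
    """Multiply all path coefficients from the given chain position onward."""
    return prod(paths[name] for name in chain[start_index:])


indirect = {
    "MCK": indirect_effect(0),
    "MPCK": indirect_effect(1),
    "PID skills": indirect_effect(2),
}

for facet, value in indirect.items():
    print(f"{facet}: {value:.2f}")
```

Under this assumption, the products round to 0.07 (MCK), 0.08 (MPCK), and 0.10 (PID skills), matching the reported indirect effects. With a lower value for the PID-to-instructional-quality path, the products would be slightly smaller.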
The picture changed when a direct effect of teachers' PID skills on student progress was included in the model (see Model 6a in Table 3). This effect was significant (0.17), while the effect of instructional quality disappeared. In line with these results, the indirect effects of teacher knowledge on students' learning progress were significant via teachers' PID skills but not via instructional quality (MCK: 0.10; MPCK: 0.13). The total effects of teachers' competence facets were also significant with somewhat larger effect sizes than in the purely indirect model (MCK: 0.13; MPCK: 0.15; PID skills: 0.20). If indirect paths from a facet of teacher competence to students' learning progress included instructional quality, none of the effects was significant (MCK: 0.02; MPCK: 0.02; PID skills: 0.03).
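The figures for Model 6a can be retraced in the same way: indirect effects via PID skills are products of the path coefficients, and total effects are sums of the components via PID skills and via instructional quality. The following worked lines use only the rounded coefficients quoted in the text, so they are an approximate check rather than a re-estimation.

```latex
\begin{align*}
\text{MPCK} \to \text{PID} \to \text{progress}: &\quad 0.78 \times 0.17 \approx 0.13 \\
\text{MCK} \to \text{MPCK} \to \text{PID} \to \text{progress}: &\quad 0.81 \times 0.78 \times 0.17 \approx 0.11 \\
\text{Total effect of PID skills}: &\quad 0.17 + 0.03 = 0.20 \\
\text{Total effect of MPCK}: &\quad 0.13 + 0.02 = 0.15 \\
\text{Total effect of MCK}: &\quad 0.10 + 0.02 = 0.12
\end{align*}
```

The values for PID skills and MPCK reproduce the reported effects. For MCK, the rounded inputs give 0.11 and 0.12 instead of the reported 0.10 and 0.13, differences that are within what rounding of the individual coefficients can produce.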
When we controlled for school type, the relation between instructional quality and student progress in mathematics disappeared in the model including indirect effects of teacher competence only (Model 5b in Table 3) and in the model including a direct effect of teachers' PID skills on students' learning progression (Model 6b in Table 3). In the latter case, the indirect effects of teachers' knowledge via their skills and the total effects of the three facets of teacher competence were no longer significant either.

Discussion
The purpose of the present study was to model the relation between teacher knowledge and students' learning progression as an effect chain with mediating processes. Based on Blömeke et al.'s (2015) conceptual framework, these mediating processes should be more proximal to how students learn than teachers' knowledge. Thereby, we intended to refine the state of research that had previously identified only weak or no systematic relations between teacher knowledge and student progress. To achieve this, we utilized data from two German studies that provided a decent sample size for such complex modeling, namely TEDS-Instruct and TEDS-Validate (Kaiser & König, 2020).
In line with existing research, we found no direct effect of teachers' MCK or MPCK on students' learning progression in mathematics, indicating that a "black-box" model (Baumert et al., 2010, pp. 160-161) omitting potentially mediating processes was unable to explain student progress. Although many earlier studies had sought to establish such a distal influence, this result is unsurprising from a cognitive-psychological perspective (Anderson, 1983). The more proximal a predictor is to an outcome, the stronger its predictive power tends to be, whereas the more distal the predictor is, the weaker the effect will be (Fishbein et al., 2001). Teachers' MCK and MPCK are relatively distal to how students learn due to their different organization, for example along the disciplinary nature of mathematics for MCK or along teaching tasks for MPCK (Shulman, 1986). It is theoretical knowledge, generalizable across different classroom situations and therefore relatively abstract. Thus, models including this type of teacher cognition solely as predictors of student achievement disregard not only classroom interaction but also the situated nature of teacher cognition. The study by Baumert et al. (2010) tried to address this challenge by comparing black-box models with direct relations of teacher knowledge to students' learning progression with mediation models that included instructional quality measures. Kelcey et al. (2019) likewise examined different facets of instructional quality as mediators of the effect of teachers' knowledge on students' learning gains. These are important steps, but a differentiation of teacher knowledge and skills was still lacking. Such a differentiation is needed to estimate whether proximal teacher competence measures are superior in predicting instructional quality and students' learning progression.
Table 1. Basic two-level random intercept models (standardized coefficients, standard errors). Note. MCK = mathematics content knowledge, MPCK = mathematics pedagogical content knowledge, T = time point.
Note. MCK = mathematics content knowledge, MPCK = mathematics pedagogical content knowledge, PID = perception, interpretation, and decision-making, T = time point.
The inclusion of teachers' PID skills, which are more proximal to classroom practice, was intended to bridge this gap and thereby extend previous studies. The data revealed that these skills appear to be sufficiently proximal to students' learning progression to predict it significantly, though the effect size was small. However, these skills also predicted instructional quality with a large effect size. Teachers' PID skills thus appear crucial to high-quality classroom management, student support, cognitive activation, and mathematics educational structuring. With this finding, our study contributes evidence to an enduring discussion about how contextualized teacher competence measures should be (e.g., Shavelson, 2010). While the study by Kersting et al. (2012) made first efforts to analyze differential effects of knowledge (MKT scores) and skills (CVA approach) in predicting student learning gains, our study provides further in-depth insights into the relevance of teachers' cognitive skills as mediators in an effect chain.
Our study revealed that teacher knowledge is relevant to student progress, thus supporting the attempts made in previous studies (e.g., Baumert et al., 2010;Hill & Chin, 2018;Hill et al., 2005;Kersting et al., 2012). First, both MCK and MPCK showed significant indirect effects of medium size on the outcome variable. Second, MPCK was a strong predictor of teachers' PID skills and MCK was a strong predictor of MPCK, both relations with large effect sizes. These results are important regarding the role of MPCK in teaching and learning because most studies examining teacher effects overlooked this dimension of teacher knowledge. Applying standardized assessment is an approach that has been developed fairly recently in empirical educational research, so only a few studies have included such a test (e.g., Baumert et al., 2010;Hill et al., 2005). With respect to MCK, these results are important in demonstrating the need for teachers to have strong content knowledge, although earlier studies had not been able to establish a link to student progress. The role of MCK in teacher education has therefore long been controversial (Wu, 2011).
Our results thus supported the effect-chain model, in particular showing that teachers' knowledge can be regarded as a precondition for teachers' PID skills. As pointed out, according to our results, proximity to student learning is the decisive criterion for the placement of teacher characteristics in this effect chain. We regard this result as the key contribution of this study, even though an experimental design was not applied.
In addition, our study conceptualizes the relation between MCK and MPCK in line with hierarchical models of cognitive abilities. Since MCK is the broader ability, it was modelled as a precondition for MPCK. The data indeed supported an interpretation of the correlation between MPCK and MCK as an effect of MCK on MPCK. It was beyond the scope of this article, but we would like to point out that several mathematically equivalent alternatives exist to model the relation of MCK and MPCK. Given our definition of overlapping content, namely that the MPCK items measure both some MCK and specific pedagogical content knowledge, the relation could be represented by a bifactor model (Blömeke et al., 2016). A competing model could hypothesize concurrent effects of MCK and MPCK on teachers' PID skills (see Fig. 2). Since the models are mathematically equivalent (Kline, 2016), we cannot decide empirically, based on their fit to the data, which of these approaches is correct.
Table 3. Relation of teacher competence to instructional quality and student progression (two-level random intercept model; standardized coefficients, standard errors). Note. MCK = mathematics content knowledge, MPCK = mathematics pedagogical content knowledge, PID = perception, interpretation, and decision-making, T = time point.
Our results were mixed with respect to the relation of instructional quality to students' learning progression. A significant relation was found in the full mediation model not controlling for school type and not including a direct effect of teachers' PID skills on student progress. The effect disappeared when either one was introduced. These results revealed that the final step of the effect chain was affected by the differences between academic and non-academic tracks in the German school system, indicated by the already substantial decrease in effect size regarding the relation of student achievement between the two measurement time points when school type was controlled for in the basic models. In Germany, secondary school types--in particular the differentiation into academic versus non-academic track--constitute differential learning environments due to institutional differences (e.g., different aspiration level) and differences in students' social background composition (Maaz et al., 2008). That means that, relatively independent from the single teacher, his or her competence, and the instructional quality implemented in the classroom, just controlling for school type will explain a substantial portion of differential learning outcomes in student assessments. This apparently reduces the amount of variability in student outcomes, which could otherwise be explained by variables that are conceptually more relevant or more proximal for students' learning processes.
Relating this phenomenon to our study, the following dilemma becomes apparent: If we do not control for school type, we risk overestimating the effects of other predictors such as instructional quality on students' learning progression. If we control for school type, we risk underestimating the effects of those other predictors. As a consequence, we consider both analysis approaches to draw appropriate conclusions. These results are in line with the study by Kelcey et al. (2019), in which the district of the school had a significant influence on the relation between instructional quality and students' learning progression, thereby making it more difficult to detect the hypothesized effects.
The results revealed, on the other hand, that teachers' PID skills and instructional quality were strongly related (standardized between-level estimate = .52; see electronic supplement, Table 2), and that neither contributed more to explaining students' learning progress than the other. This is a remarkable result as it is typical for measurement that intercorrelations among constructs measured by similar types of assessments and data sources are higher than between constructs that use different kinds of assessments and data sources (in our case, externally rated observation protocol data versus video-based teacher assessment versus paper-pencil student assessment). The differences often contribute variance to scores beyond the attribute of interest (Eid et al., 2003).
Teachers' PID skills were also the only competence facet that had a direct effect on students' progression, most likely reflecting effects of third variables not included in our model. The mediation model is the most convincing model in a conceptual sense, and our empirical findings largely support this model. However, we cannot claim to have assessed and controlled for all variables potentially relevant in our context. This was impossible for practical reasons (sample size, testing time, funding available) but, given the state of research, also for theoretical reasons (lack of studies that examine other variables).

Limitations
Although we carefully derived our hypotheses based on existing theoretical and empirical research, the terms "direct effects" and "indirect effects" and our mediator analyses may suggest causal relations. However, we could only estimate correlations. While we could control for potential selection effects by including the type of school students were attending and their language background as control variables, students were neither randomly assigned to teachers nor were teacher measures taken sequentially (Pearl, 2009;VanderWeele & Vansteelandt, 2009). This restricts the possibility of causally interpreting the results. Moreover, the temporal order of the assessments was not completely in line with the theoretical model, which suggests a sequential effect chain, whereas we assessed knowledge and skills at the same time.
The weakest part of the effect chain concerned the relation of instructional quality to students' learning progression. It is generally difficult to provide findings related to development given that prior knowledge typically explains much variation in the dependent variable. This phenomenon is even more visible on the between level than on the within level. In our study, students' learning progression was additionally assessed using different instruments at the first and second measurement time point. Although all tests are based on the national standards for mathematics education in Germany and share common items, they were operationalized differently. Therefore, we did not compare absolute achievement levels but estimated relations within a multi-level framework where students' achievement is connected to their teachers (i.e., within the same state). Such relations are typically less prone to bias.
Fig. 2. Alternative model of the relation between teachers' domain-specific knowledge, PID skills, and students' learning progression in mathematics with concurrent effects of MCK and MPCK, controlling for school type and student background (dotted lines: competing hypothesis of potential direct effects of MCK and MPCK).
Another limitation is that only instructional quality could be specified as a latent variable, whereas all other constructs had to be specified as manifest variables due to limited sample size. This may have weakened the potential relations. Furthermore, since our study used data gathered in the context of educational monitoring, students' background data did not fully meet our needs. Instead of individual data, we used class composition data as a proxy measure, although such aggregation may alter a variable's meaning (Raudenbush & Bryk, 2002) and, in our context, may indicate district rather than family effects.
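The weakening mentioned for manifest variables can be illustrated with the classical attenuation formula from test theory, in which the observed correlation between two measures shrinks with their reliabilities. The reliabilities and the true correlation in the example below are hypothetical values chosen for illustration only; they are not estimates from this study.

```latex
\begin{align*}
r_{xy}^{\text{obs}} &= r_{xy}^{\text{true}} \sqrt{\rho_{xx'}\,\rho_{yy'}} \\
\text{Example (hypothetical): } r_{xy}^{\text{true}} &= 0.50,\quad \rho_{xx'} = 0.80,\quad \rho_{yy'} = 0.70 \\
r_{xy}^{\text{obs}} &= 0.50 \times \sqrt{0.80 \times 0.70} \approx 0.50 \times 0.75 \approx 0.37
\end{align*}
```

Here $\rho_{xx'}$ and $\rho_{yy'}$ denote the reliabilities of the two manifest measures. Latent variable models correct for this kind of measurement error, which is why specifying more constructs as latent would be expected to yield somewhat stronger relations.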
Finally, we must acknowledge limitations with respect to our sample. First, the teacher sample was a convenience sample, meaning we should assume self-selection bias. Second, one purpose in compiling two subsamples from different federal states was to reach a decent sample size that allowed for testing our complex models. They would ideally have been separated to test the models' robustness across different groups. However, this would have reduced the number of units on the second level.

Conclusions
Overall, our results again revealed that 'black-box' models seeking to directly relate teacher knowledge to student achievement do not work well. To identify the role of teacher knowledge, it is necessary to model the relation between the different facets of teacher competence and between these and students' learning progression in a more elaborated way. Otherwise, we risk drawing conclusions based on simplified assumptions, which could potentially harm teacher education that delivers teacher knowledge (Darling-Hammond, 2006). It is therefore recommended that future studies consider effect chains including the indirect rather than direct effects of teacher knowledge.
In this context, an intriguing conceptual and methodological research question is which time intervals would be appropriate to assess each of the knowledge, skills, instructional quality, and student learning variables in the model so that the study design is in line with the theoretically assumed sequence. This applies especially to measures with limited stability, such as instructional quality. Some scholars have addressed this issue similarly to the present study, namely by assessing instructional quality in several lessons with adequate generalizability to explain students' gain in achievement over an extended time period (e.g., Malmberg, Hagger, Burn, Mutton, & Colls, 2010;Meyer et al., 2011). Another approach, sometimes taken in subject-specific studies, would be to assess instructional quality depending on the content that is taught in class (e.g., the Pythagorean Theorem by Lipowsky et al., 2009). However, in this case the achievement test would also have to be adjusted to the corresponding content areas ("instructional sensitivity", see e.g., Naumann et al., 2016), which would limit the generalizability of the results.
Furthermore, it is important to think about additional potential mediators because one can argue that there is still limited validity in this respect. For instance, our study design was unable to address explicit adaptations teachers may make either during lesson preparation or on the fly. Teachers need to adjust their instructional practices to students' needs and context characteristics (Klieme, 2013;Parsons et al., 2018). It is difficult to examine such non-linear relations, but learning more about adaptive processes or additional mediators on the student side (e.g., meeting the basic needs for autonomy, competence, and relatedness; Ryan & Deci, 2000) would be important for designing teacher education (and follow-up professional development) properly.
Our results stress the importance of teachers' PID skills, which seem closest to occurrences in the classroom and to students' learning progression. Teacher education and professional development should therefore pay particular attention to developing this facet of teacher competence. We assume that PID skills do not develop naturally but must be supported by special activities involving, for example, classroom videos that train teachers' PID skills (Santagata et al., 2021;Star & Strickland, 2008).
The results indicate that in Germany it is of key importance whether students attend a Gymnasium or another type of school. The significant school type effect points to institutional differences, for example, regarding curricula, but also to student composition effects (Maaz et al., 2008), which is similar to the district effect in the US (Kelcey et al., 2019). Future studies should therefore analyze the extent to which instructional quality may be determined by institutionalized grouping and how this may influence the effect chain from teacher competence via instructional quality to student progression. The present study has opened up research perspectives that future investigations should pursue beyond using school type as a mere control variable.
We suggest that future studies include further facets of teacher competence and student outcomes as our focus in both respects was on cognitive constructs. However, affective-motivational constructs should also be examined because they are not only predictors and outcomes themselves but are instrumental in supporting students' learning progression. Several studies have examined these relations (e.g., Blazar & Kraft, 2017;Kunter et al., 2013) but covered only parts of a potential effect chain.
Finally, an important open question is whether our mediation model with its implicit causal relations also means that the different facets of teacher competence should be acquired sequentially. Having strong MCK may be regarded as a precondition for being able to have strong MPCK based on our results, and strong MPCK may in turn be regarded as a precondition for being able to have strong PID skills. The question is whether opportunities to learn MCK, MPCK, and PID skills during teacher education also need to be offered in this order or whether they can or even should be taken to some extent in parallel to each other or in some integrated version to facilitate connectedness of these cognitive facets. How teacher education should be designed, whether it should adopt an integrated perspective on developing teacher competence or develop these sequentially, has been and remains controversial (Flores, 2016;Santagata & Yeh, 2016). Evidence for either model is still lacking and needs further research.