Assessing pre-service history teachers’ pedagogical content knowledge with a video survey using open-ended writing assignments and standardized rating items

This paper explores pre-service history teachers’ ability to recognize and reflect on typical situations occurring in the history classroom and to link these to students’ historical learning. To this end, we draw on the concept of professional vision (Goodwin, 1994), which assumes that teachers need a professional knowledge base to monitor and to reason about teaching and student learning. Based on theoretical notions of teachers’ pedagogical content knowledge (PCK), we investigated history teachers’ professional vision by means of a video survey with integrated video clips, open-ended writing assignments and standardized item ratings. We collected data from 303 and 220 pre-service teachers at the beginning and at the end, respectively, of their subject-specific teacher training. The collected data open up the possibility of ‘simultaneous triangulation’ (Morse, 1991), which was used for test validation. First, we tested the reliability of the closed-ended test instrument using item response theory, in order to develop a feasible test model. Second, we investigated the validity of the test instrument by comparing test results with the findings of the open-ended writing task. In general, student teachers reached rather low test scores. They experienced difficulties in assessing classroom events in terms of their potential to support historical competencies and in evaluating the consequences for students’ learning. Findings from the open writing assignment show that student teachers commented largely on generic teaching strategies while hardly noticing student learning. In sum, the chosen methodological approaches seem to contribute to a more distinct picture of pre-service teachers’ abilities to reason about history and learning.


Promoting historical thinking as a central goal of history education
In recent years, the concept of historical thinking has come to the forefront of history education (for example, Seixas and Morton, 2013; VanSledright, 2009; Wineburg, 2001). There seems to be a broad consensus that history teaching should foster students' 'historical thinking' (Seixas, 2017; Wineburg, 2001), 'historical reasoning' (Van Drie and Van Boxtel, 2008) or 'historical learning' (VanSledright, 2014). These concepts comprise similar aspects of historical thinking, such as asking historical questions, applying heuristics while working with sources (sourcing, contextualization, corroboration) and using second-order concepts or disciplinary ideas that structure the discipline (Lee, 2011: 137).
History Education Research Journal 16 (1) 2019
In German-speaking literature, different models of historical thinking have emerged over the last two decades. These comprise several inter-related competencies that students should acquire over time (Barricelli et al., 2012; Gautschi, 2009; Körber et al., 2007). For example, German history educationalists belonging to the FUER group (the FUER-Geschichtsbewusstsein project stands for Förderung und Entwicklung eines reflektierten Geschichtsbewusstseins, or Promotion and Development of a Reflected Historical Consciousness; see Körber et al., 2007) defined four historical competencies: (1) being able to ask historical questions; (2) using methodological approaches to analyse and interpret relevant historical sources and accounts; (3) providing orientation, that is, developing the ability to reflect on information and insights about the past and to connect these to one's own life, thus orienting students towards their own identities; and (4) developing subject matter competency, which relates to history as a mental construct (for example, narrativity, constructiveness), including the ability to make use of first- and second-order historical concepts that help to structure the discipline (cf. Körber and Meyer-Hamme, 2015: 93-4; Seixas and Morton, 2013; Van Boxtel and Van Drie, 2018: 155-7).
The German competency debate (Klieme et al., 2003) and the notion of historical competency have since influenced the new curriculum for primary and secondary schools in German-speaking Switzerland. This requires teacher education to equip prospective history teachers with the professional knowledge needed to promote their students' historical competencies and historical thinking. Our video-based research project, VisuHist, investigated pre-service teachers' ability to recognize key features of history teaching with a potential to foster historical competencies and to evaluate the consequences for students' learning. This article presents the theoretical framework underlying our mixed-methods approach and the empirical findings obtained from validating the VisuHist video-survey instrument.

Professional knowledge of history teachers
The concept of teachers' professional competency rests on the notion that teaching knowledge and skills can be gained with a view to becoming a successful part of an autonomous community of practitioners (Kunter et al., 2013: 806). Thus, the pedagogical 'knowledge base', including the cognitive knowledge required to create effective teaching and learning environments, must be taught during teacher training (for example, Blömeke and Delaney, 2012).
However, describing what this knowledge base is supposed to be is a complex undertaking. Early on, Shulman (1986: 9) distinguished between content knowledge, pedagogical knowledge and pedagogical content knowledge (PCK). He characterized PCK as 'subject matter knowledge for teaching'. Further, PCK involves both 'the ways of representing and formulating the subject that make it comprehensible to others' and 'an understanding of what makes the learning of specific topics easy or difficult' (ibid.: 9). Shulman also argued that PCK helps teachers to create lessons that advance students' subject matter understanding, to recognize students' misconceptions and epistemological beliefs, and to develop pedagogical responses that support students' learning.
Recent research on history teachers' professional knowledge points to the relevance of a profound disciplinary understanding of history as a prerequisite for promoting students' historical thinking. This implies that history teachers' content knowledge goes beyond mere knowledge of historical events and chronologies. Rather, it involves more sophisticated historiographical knowledge of how historical narratives are created and revised (Achinstein and Fogo, 2015: 47; Bain and Mirel, 2006; Waldis et al., 2014). This content knowledge contributes to selecting relevant historical content and concepts, and to designing history lessons.
To create lessons that advance historical thinking, teachers need PCK, which requires knowledge of the instructional methods and media needed to organize and present history, to support historical analysis skills, and to make historical concepts and contents comprehensible to diverse groups of students (Achinstein and Fogo, 2015; McArthur Harris and Bain, 2011; Monte-Sano and Budano, 2013). Thus, one key aspect of PCK is the ability to translate subject matter into formats that are intelligible to students, including diverse forms of historical representation and learning tasks (Kanert and Resch, 2014; Resch and Seidenfuß, 2017; Monte-Sano, 2011: 261). For competency-oriented teaching, PCK includes the ability to create and apply historical questions to frame instruction or to support students in developing their own, to initiate and to model historical analysis, to facilitate classroom discussion about historical texts and artefacts (Achinstein and Fogo, 2015: 55), and to stimulate historical orientation (for example, elaborating on the contemporary relevance of a historical event).
A second essential aspect of PCK is the ability to understand students' disciplinary thinking and competency levels, as well as their ideas and misconceptions about history. Thus, PCK enables teachers to anticipate, recognize and respond to students' conceptions as articulated in their oral contributions and written work (Monte-Sano, 2011: 261). In this respect, theoretical models of historical thinking or historical competencies might help to develop formative assessment of student learning and diagnostic methods.
Over the last two decades, interest in the PCK needed for history teaching has grown and the literature on this topic has expanded. This work provides insights into how students process historical texts, employ evidence and multiple sources of historical information, and engage in historical empathy (for example, Davis et al., 2001; Lee, 2005; Reisman, 2012; Voss and Wiley, 2000). Further research has investigated the effects of teachers' professional development on students' historical thinking or reasoning (for an overview, see Van Hover and Hicks, 2018). In addition, several case studies have explored how student teachers develop the PCK needed to cultivate students' interpretative and evidence-based thinking (for example, Monte-Sano, 2011; Monte-Sano and Budano, 2013) or to assist novices in teaching historical reasoning (Achinstein and Fogo, 2015). However, little research exists on assessing PCK in larger groups of pre-service history teachers, which would make it possible to address the question of which learning objectives are achieved in teacher training. Consequently, there is a lack of stringent methodological approaches for capturing PCK.

The concept of 'professional vision'
The concept of professional vision (Goodwin, 1994) offers a promising approach to measuring those aspects of teacher knowledge that refer to the contextualized and situated real-world demands of history teaching. The concept has also become increasingly important in describing the initial processes of integrated knowledge acquisition within teacher education (Santagata and Guarino, 2011; Star and Strickland, 2008; Stürmer et al., 2013).
Professional vision involves two main sub-processes: (1) selective attention, also called 'noticing'; and (2) knowledge-based reasoning (Sherin and Van Es, 2009). The first sub-process refers to the close observation of practice. It includes activities for distinguishing important and unimportant features. From a professional point of view, it is important to identify situations and events in the classroom that are decisive for successful teaching (Barth, 2017). Thus, professional knowledge structures might support this process (Seidel and Thiel, 2017).
The second sub-process, knowledge-based reasoning, addresses the need to recognize students' thinking to be able to respond to their conceptions of subject matter by assigning appropriate activities. Knowledge-based reasoning involves describing, explaining and evaluating significant key features of classroom events, such as instructional goals, pedagogical strategies and class interaction, as well as predicting those features' possible impact on students' learning (Stürmer and Seidel, 2015). Crucial to describing is the ability to identify and differentiate relevant events without further judgement. Explaining means the ability to activate professional knowledge and to link it to classroom events for the purpose of reasoning about teaching and learning activities. Predicting refers to the ability 'to predict the consequences of observed events in terms of students' learning' (ibid.: 55). It implies the use of diagnostic competencies in exchange with other knowledge bases (content knowledge, pedagogical knowledge) (Michalsky, 2014).
Taking into account the situated nature of professional knowledge, the concept of professional vision serves as a basis for capturing student teachers' knowledge of competency-oriented teaching. In contrast to traditional methods of competency measurement (paper-and-pencil tests, for example), assessing professional vision with a suitable test procedure, including video clips of classroom events, may have the advantage of taking into account professional knowledge relevant to practice (Lindmeier, 2013). Moreover, videos of classroom interaction represent both domain-specific and generic aspects of instruction, thus enabling them to potentially activate knowledge in both areas. In addition, video reflection is quite well-established in teacher training in German-speaking Switzerland. Our study therefore investigates the professional vision of prospective history teachers based on videotaped excerpts from history lessons showing key features of history teaching.

Validating a video survey using between-method triangulation
To capture the professional vision of prospective history teachers, we developed a video-based instrument (hereafter referred to as video survey). This research tool combines videotaped real classroom situations with open writing assignments and standardized ratings assessed with a closed-ended item format. It thus comprises two different methodological approaches for capturing and evaluating participants' professional vision. The open writing assignments encouraged student teachers to write down their observations on the videotaped classroom situations, in particular to identify features of teaching relevant to students' learning. The standardized rating items served to measure pre-service teachers' ability to assess those aspects of teaching that are linked to historical thinking and to estimate students' learning.
Thus, the collected data allowed us to compare the results of the different assignment tasks and to validate the research tool. Of particular interest was whether the teaching strategies operationalized in the closed items were also mentioned in the answers to the writing task. In the methodological literature, this procedure is referred to as 'between-method triangulation' (Denzin, 1978, cited by Johnson et al., 2007). We analysed the reliability of the closed-ended rating items using item-response theory, and the written comments using qualitative content analysis (Mayring, 2000). According to Morse (1991), the simultaneous use of quantitative and qualitative methods, with limited interaction between the two data sources during data collection but with findings that complement one another at the interpretation stage, is called 'simultaneous triangulation' (Johnson et al., 2007: 115).

Research questions
Guided by our interest in identifying competency-oriented aspects of PCK in prospective history teachers' professional vision, our study raises three research questions, each aimed at validating the video survey:
1) Does the developed test instrument involving standardized rating items produce a reliable and valid measure of professional vision in the pre-test? Do we obtain comparable measures in the post-test with regard to reliability and item difficulty?
2) Which aspects of students' disciplinary thinking and which elements of history teaching do student teachers recognize in an open writing assignment when confronted with selected video clips at the beginning and at the end of their history didactic courses?
3) Which findings arise with regard to the validity of the survey instrument when the results of the open writing assignments and the standardized test scores are combined? How do the results from the two data sources deepen our understanding of future history teachers' professional vision and the underlying PCK?

Methodology
Sample and study design
Our study was situated in the context of single-phase teacher training programmes in German-speaking Switzerland preparing future history teachers for lower (Levels 7-9) and upper secondary schools (Levels 10-13). The goal was to assess the professional vision of future history teachers. Data collection took place between 2013 and 2016 in six teacher training institutions, before and after survey participants attended their history didactic courses. However, the modular structure of teacher training and university reforms led to reduced post-test and longitudinal samples. Table 1 gives an overview of the samples.

Instrument: Video survey
In a multi-step procedure, we developed and piloted an online survey involving integrated video clips, an open writing assignment, and standardized rating items (Waldis et al., 2014). The final test instrument included four video clips from history lessons filmed in lower secondary schools (Grade 9) from a previous research project (Gautschi et al., 2007). Our selection of suitable video clips lasting 10 to 12 minutes focused on classroom events with a potential to promote historical competencies. We chose situations in which teachers asked historical questions to frame instruction, made use of second-order concepts to structure time, initiated classroom and group discussion to analyse and interpret historical texts and artefacts, and stimulated historical orientation (see Table 2). Since the video excerpts came from everyday history lessons, they also enable critical analyses or formulating teaching alternatives. For each clip, the participating pre-service teachers first had to answer an open writing assignment task: 'What did you notice? Describe key features of the observed instruction based on subject, subject didactic and general didactic criteria that you consider relevant to students' learning. Try to describe the key elements of this lesson excerpt before you start evaluating it.' Next, participants were given standardized rating items, which we designed to measure the ability to discern central teaching strategies aimed at promoting historical competence (description) and to predict students' learning (prediction) following the Observer approach (Stürmer and Seidel, 2015). All items consisted of a four-point Likert scale ranging from 'strongly agree' to 'disagree'. The final test instrument contained 89 items (see Table 3 for scales and item examples).

Analysis of standardized item ratings
We compared the answers to the standardized rating items to a criterion-referenced norm derived from expert judgements. We obtained the expert judgements by following Oser et al. (2013). In a first step, teacher trainers of the tested participant groups and research team members each answered the video survey individually. In a second step, we compared the answers of these experts. In case of disagreement, we discussed the potential answers in meetings and determined the definite answer. We matched student teachers' answers with the expert norm (hit/non-hit) and measured internal scale consistency using Cronbach's alpha. At this point, scales that reached a consistency above .50 for each subdimension remained in the test. However, exploratory factor analysis showed that the description and prediction items of the respective scales loaded on one factor. Therefore, we could not model description and prediction as two scale subdimensions.
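The scoring step described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the procedure actually run in the study: the expert norm, the ratings and the function names `score_hits` and `cronbach_alpha` are hypothetical, and treating a 'hit' as an exact match with the expert rating is an assumption.

```python
from statistics import pvariance

def score_hits(ratings, expert_norm):
    """Return 1 for each rating that equals the expert rating, else 0."""
    return [1 if r == e else 0 for r, e in zip(ratings, expert_norm)]

def cronbach_alpha(score_matrix):
    """Cronbach's alpha for a respondents-by-items matrix of scores."""
    k = len(score_matrix[0])  # number of items in the scale
    item_variances = [pvariance([row[i] for row in score_matrix]) for i in range(k)]
    total_variance = pvariance([sum(row) for row in score_matrix])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

# Hypothetical expert norm for one four-item scale (rating categories 1-4).
expert_norm = [4, 1, 3, 2]

# Hypothetical ratings from five student teachers on the same four items.
ratings = [
    [4, 1, 3, 2],
    [4, 1, 3, 1],
    [3, 1, 2, 1],
    [2, 3, 2, 1],
    [4, 2, 3, 2],
]

hits = [score_hits(r, expert_norm) for r in ratings]
alpha = cronbach_alpha(hits)

print(hits)             # hit/non-hit matrix, one row per respondent
print(round(alpha, 2))  # the scale would be retained if this exceeds .50
```

With these invented ratings the scale would pass the .50 threshold mentioned above; with real data, the same two functions would be applied scale by scale.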

Open writing assignment
We developed a content analysis to investigate participants' responses (Mayring, 2000). To cover the topics addressed, we developed a category system applying procedures of inductive category development (see Table 4 for the final categories). The category 'learning gains', for example, distinguished general comments (e.g. the students learned a lot) from subject-specific ones (e.g. the students formed a value judgement). A member of the research team trained three research assistants to code the data. Inter-rater reliability (Krippendorff, 2013) was satisfactory for all categories at the initial and mid-term time points (α > .60).
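For readers unfamiliar with the inter-rater coefficient, the following sketch computes Krippendorff's alpha for nominal codes from a coincidence matrix. It is a minimal illustration under stated assumptions: the function name `krippendorff_alpha_nominal` and the example codes (1 = category assigned to an entry, 0 = not assigned, three coders) are hypothetical, and the sketch does not reproduce the software used in the study.

```python
from collections import Counter
from itertools import permutations

def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal data.

    units: one list per coded entry, holding the codes given by the
    coders who rated that entry (missing codes are simply left out).
    """
    coincidences = Counter()
    for codes in units:
        m = len(codes)
        if m < 2:
            continue  # entries with fewer than two codes cannot be paired
        for a, b in permutations(codes, 2):
            coincidences[(a, b)] += 1 / (m - 1)

    totals = Counter()
    for (a, _), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())

    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    expected = sum(
        totals[a] * totals[b] for a in totals for b in totals if a != b
    ) / (n - 1)
    if expected == 0:
        return 1.0  # only one code value occurs, so no disagreement is possible
    return 1 - observed / expected

# Hypothetical codes from three coders for four written entries.
example_units = [
    [1, 1, 1],
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
]

alpha = krippendorff_alpha_nominal(example_units)
print(round(alpha, 3))  # compared against the .60 threshold reported above
```

In this invented example a single disagreement among twelve codes still leaves the coefficient above .60, which illustrates why the threshold is usually read as a minimum rather than as evidence of strong agreement.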

Test validation of the standardized rating items
To answer our first research question, we integrated items from scales with sufficiently high reliability into a two-dimensional Rasch model using the software Stata 14.2 (www.stata.com/stata14/irt/). We analysed the difficulty and discrimination of each test item to select a consistent item pool. Items with discrimination parameters below 0.4 (Wu and Adams, 2007: 64) were excluded. We ran unidimensional model estimations for the overall scale (professional vision). The final test model included 33 items. The scale indices for the overall scale were satisfactory. The discrimination parameter of the pre-test model was 2.19 (SE = .10); it was 1.82 (SE = .14) for the post-test model. Test scales that required the assessment of the implementation of didactic principles, such as perspective taking/recognition of alterity (scale A) and asking historical questions (scale F), had to be excluded. All other scales remained in the test, but mostly only after exclusion of several items. We calculated overall pre- and post-test scores, each summing the 33 items. For the longitudinal sample, the mean test scores were M_pre = 8.02 (SD = 6.50) and M_post = 7.58 (SD = 7.10), which is surprisingly low given that 33 items were summed. The pre/post comparison (t-test for dependent samples) showed no significant difference. Adequately assessing competency-oriented history teaching seems to be difficult for pre-service teachers.
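To make the role of the reported discrimination parameter more concrete, the sketch below evaluates a one-parameter logistic (Rasch-type) response function with a discrimination value shared by all items. It is an illustration under assumptions, not the estimation carried out in Stata: the parameterization P = 1 / (1 + exp(-a(θ - b))), the function names `p_hit` and `expected_score`, and the item difficulties are assumed for the example; only the value 2.19 is taken from the pre-test model reported above.

```python
import math

def p_hit(theta, difficulty, discrimination=1.0):
    """Probability of a hit for ability theta on an item of given difficulty."""
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - difficulty)))

def expected_score(theta, difficulties, discrimination=1.0):
    """Expected sum score: the hit probabilities added up over all items."""
    return sum(p_hit(theta, b, discrimination) for b in difficulties)

PRE_TEST_DISCRIMINATION = 2.19  # common discrimination reported for the pre-test

# Hypothetical item difficulties on the logit scale (not taken from the study).
difficulties = [-1.0, 0.0, 1.0, 2.0]

# Hit probabilities of a respondent with average ability (theta = 0).
probabilities = [p_hit(0.0, b, PRE_TEST_DISCRIMINATION) for b in difficulties]

for b, p in zip(difficulties, probabilities):
    print(f"difficulty {b:+.1f}: P(hit) = {p:.3f}")
```

Under this parameterization, a respondent whose ability equals an item's difficulty has a hit probability of exactly .5, and a high common discrimination makes that probability fall off steeply for harder items. Low sum scores on a 33-item test are therefore what one would expect if most items lie well above the typical respondent's ability.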

Results of the open writing assignments
In the longitudinal sample, 138 pre-service teachers recorded 406 entries on the observed video clips. The percentages of the noticed content categories are shown in Figure 1. Of the entries, 35 to 50 per cent contained comments on the lesson plan, the social form used and teacher-led classroom interaction. Participants also referred frequently to the media used, organizational and methodological matters, historical content and principles of history learning (> 30 per cent). Fewer than 12 per cent of the entries (hence a negligible amount) commented on students' learning activities and learning outcomes, with the exception of student participation in class discussions, which student teachers did comment on. Generally, participants focused on the instructional and generic aspects of teaching, but gave less attention to subject-specific aspects of teaching and students' historical thinking. Within the content categories, we found significant differences between pre- and post-test entries for the categories 'competency models of historical thinking', 'domain-specific learning activities', 'general media use', and 'social form' (McNemar test, p < .05). Thus, a slight shift towards more frequent attention to subject-specific aspects of teaching was found in the post-test. Nevertheless, generic aspects of teaching dominated student teachers' comments at both test times.
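The logic of the McNemar test used for the pre/post comparison can be sketched as follows. The example relies on assumptions: it uses the exact (binomial) variant of the test, pairs observations per participant, and works with invented data; the function names `discordant_counts` and `mcnemar_exact` are hypothetical and the study's own computation may have differed.

```python
from math import comb

def discordant_counts(pre, post):
    """Count pairs that changed: (mentioned only at pre, mentioned only at post)."""
    only_pre = sum(1 for x, y in zip(pre, post) if x and not y)
    only_post = sum(1 for x, y in zip(pre, post) if not x and y)
    return only_pre, only_post

def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from the two discordant counts."""
    n = b + c
    if n == 0:
        return 1.0  # no participant changed, so there is nothing to test
    tail = sum(comb(n, k) for k in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Hypothetical data: 1 = participant mentioned the category, 0 = did not.
pre = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1]
post = [1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1]

b, c = discordant_counts(pre, post)
p_value = mcnemar_exact(b, c)
print(b, c, round(p_value, 4))
```

Only participants whose coding changed between the two test times enter the statistic; those who mentioned a category at both times, or at neither, carry no information about a shift.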

Triangulation of the results between the two data sources
Overall, student history teachers achieved rather low test scores in our standardized test, which focused on teaching strategies aimed at promoting historical competence, whereas in their commentaries on the open writing task, generic teaching aspects stood in the foreground. These findings led to the question whether participants would even notice the subject-specific teaching elements assessed with the standardized rating items without our guidance through the means of questionnaire items. This question raises the issue of ecological validity of the standardized instrument. In addition, we were interested to gain a better understanding of student teachers' ability to notice and evaluate teaching aspects relevant for students' disciplinary learning in an observation situation with few requirements. We therefore selected post-test comments of the writing task on the first (National Socialism) clip, categorized under 'competency models' and 'didactic principles', and subjected it to a closer analysis.
Results were as follows: many comments mentioned the terms 'to empathize with', 'to project oneself back into that time' or 'to adopt another perspective'. Comments often emphasized the relationship with one's own life or being able to compare former experiences with present-day ones:
The topic is linked to the life-world, since the texts are about young people of a similar age, except that they lived in a different time. Students can therefore make comparisons with their own youth and their experiences, for example, in summer camps. The topic also touches on the lives of 'ordinary citizens' and not of the authorities. It gives students the opportunity to adopt a different perspective and to consider whether they would have acted the same way under the circumstances. (EES19-HJ-t2)
However, the recognition of otherness is paraphrased in more general terms, and the associated challenge of presentism is rarely addressed in our sample:
It's always challenging to project ourselves back into the past. Pupils often struggle to go back in time, because they adopt the realities and values of today's world instead of immersing themselves totally in those days.
In contrast, student teachers used didactic terminology associated with historical consciousness and models of historical competencies. However, competence-specific terms were used somewhat superficially and did not contribute to better understanding the historical thinking involved:
According to Pandel [see Pandel, 1987], getting students to immerse themselves in those days promotes their awareness of identity. (HH12_HJ_t2_k1)
The teacher gives pupils the following assignment: Imagine you were living in 1938, and are as old as you are now. Would you have gone along with things? The teacher asks for orientation competency to experience time. (HMF15_HJ_t2_k1)
Finally, we also found entries that linked instructional activities to students' learning processes using didactic terminology to describe and evaluate teaching and learning in a well-founded way:
On the whole, I consider the group assignment good and didactically well thought out. Since the students have to project themselves into young people during the Hitler period, this fosters their change of perspective and their understanding of teenagers living in Nazi regime. The assignment also personalizes the problems of people living in a dictatorship, which certainly stimulates pupils' interest. (GRL31_HJ_t2_k1)
To sum up, the pre-service teachers recognized and described central teaching aspects of the National Socialism clip, albeit in a rather general and implicit manner. They may not even have been aware that they were commenting on key aspects of history teaching. In addition, many future teachers seem to lack a subject-specific language. Even if they know competence-specific terms, they do not seem to match them to the teaching practice shown. Thus, their professional vision seems to be rather limited, or they encounter considerable difficulties in applying their PCK to authentic classroom situations. Nevertheless, some survey participants recognized and elaborated on teaching aspects that were operationalized in the standardized instrument. This result suggests that the test instrument is ecologically valid in this respect.

Summary and conclusion
This study investigated a video survey designed to better grasp the professional vision of future history teachers. We validated the test instrument with a group of pre-service history teachers in German-speaking Switzerland. Rasch analysis yielded a reliable test instrument. However, during step-by-step validation, a large number of items had to be excluded. Many items proved to be too difficult to answer or the assessment of the statements was indecisive. This finding goes hand in hand with the experience that even defining an expert standard demanded much discussion, especially for scales covering 'asking historical questions' and 'stimulating historical orientation'. This result points to the need to sharpen the theoretical concepts within the teaching community, and to design models for practice in order to develop a shared understanding of innovative didactical approaches that foster these competencies.
We implemented open writing assignments to validate our standardized approach and to enrich our knowledge base about pre-service history teachers' professional vision. Quantitative content analysis shows that student teachers tend to comment on generic aspects of teaching while barely recognizing student learning. This concurs with similar results from other domains, which report that novices are capable of describing teaching events, but that their ability to accurately explain classroom situations and to predict their consequences lags behind that of experienced teachers (Stürmer and Seidel, 2015). In addition, pre-service teachers experienced difficulties in using adequate language to comment on subject-specific teaching processes. Besides this language issue, we assume that the profound pedagogical knowledge needed to reason about teaching and learning in the history classroom is lacking. Subject-specific theories and didactic concepts seem to be still a relatively unknown territory.
One further validity issue is whether the addressed teaching characteristics were visible enough for the participants. Visibility may significantly influence test scores. The selected video clips were long compared to other studies in this field (for example, Stürmer and Seidel, 2015). We chose longer video clips to give student teachers the opportunity to understand the broader context of teachers' instructional activities and students' learning. However, this might have caused an information overload, which made it difficult for novice teachers to recognize the competence-enhancing potential of teaching strategies identified by experts. Future studies need to consider whether shorter clips would not only be more suitable but would also enable closer observation of subject-specific issues, thus producing higher test scores. Additionally, the type of video material could be better adapted to learning objectives; for example, showing good teaching practices more explicitly in order to give viewers the chance to 'see' (notice) lesson segments more skilfully. In order to understand the development of teachers' ability to notice and to reason about history teaching in more detail, the effects of familiarity with video analysis and other context variables, which were also collected in the test, need to be analysed more closely.
Overall, the application of standardized rating items and open writing assignments in this study provides us with a first insight into the professional vision of student history teachers. The triangulation method applied led to validating the test instrument and will help to interpret the test results in further directions.