The structure of the epistemological development in teaching learning questionnaire

New measurement of students’ learning and teaching concepts is essential for creating constructive alignment in teaching and for supporting formative assessment to promote epistemic development. The Epistemological Development in Teaching Learning Questionnaire (EDTLQ) was developed to meet these needs. In the present study, the factor structure of the EDTLQ was examined using a sample of 643 students from a Swedish university. The results show that the correlated six-factor model fits the data best. This result is consistent with developmental theory that posits development as a dynamic, highly correlated process varying across and within domains. There is a potential to use the EDTLQ as a tool for adapting teaching to appropriate levels of understanding within different domains. The EDTLQ is one of the few measurements that can be used to assess students’ learning concepts so that education/teaching can be adapted to support students’ development of more complex levels of thinking about learning.
Subjects: Assessment & Testing; Higher Education; Assessment


ABOUT THE AUTHOR
The authors are an international group of interdisciplinary researchers that came together to develop and psychometrically test the EDTLQ. Hudson Golino, PhD, assistant professor of quantitative methods, brings expertise in psychometrics. Rebecca Hamer, PhD, senior manager Assessment Research and Design, codeveloped the epistemological model supporting the EDTLQ. Ellen Almers, PhD, associate professor (education and didactics in natural sciences and sustainability), and Sofia Kjellström, PhD, professor of quality improvement and leadership (adult development theories), designed the EDTLQ and supported data collection as part of their research into complexity within education for sustainability in higher education.

PUBLIC INTEREST STATEMENT
The EDTLQ is one of the few tools that potentially can be used to assess students' interpretation of learning concepts so that education/teaching can be shaped to support students' development towards more complex levels of thinking about learning. If the EDTLQ is used during an educational program, it can be used to assess learning and adapt the teaching approach to students' appropriate levels of understanding, as well as to track their development over time within different knowledge domains.

Introduction
University studies are expected to trigger and encourage development in students' ways of understanding knowledge and the world. Students' conceptions about teaching and learning are measurable manifestations of students' epistemic perspective: how students perceive and understand the world around them (Van Rossum & Hamer, 2010). One way to establish higher education's success rate in fostering epistemic development would be to measure the change over time in students' conceptions of learning and teaching, as well as their interpretation of concepts such as understanding or application, all of which can be used to characterize qualitatively different learning outcomes and epistemic development. In this sense, periodic measurement of students' views on learning, good teaching, understanding, application and so on would support a process of formative assessment (Bennett, 2011) in support of epistemic development. Formative assessment here refers to the broad meaning of a process informing teachers on how to modify teaching to adapt it to the needs of their students. This includes both the original meaning of formative evaluation (Scriven, 1967), aimed at facilitating program improvement, and the more recent focus on student learning instead of programs, building on the interpretation by Bloom (1969).
Furthermore, knowledge of the students' interpretations of teaching and learning is essential to match teaching content and approach with students' understandings in order to create constructive alignment in educational goals and examinations (Biggs, 1996, 1999) and for assessment of the developmental pathways through university studies. All these uses focus on the epistemic perspective of the student. However, the epistemic perspective of the teacher also affects the achievable learning outcome, in particular in an epistemic sense. Reviewing a body of research on teacher thinking, Van Rossum and Hamer (2010) present a convincing case that teachers' conceptions of learning and teaching, as well as their views on the nature of concepts such as understanding, application, and classroom discussion, affect how they shape their teaching towards either encouraging students' epistemic development or, in many cases, hampering or even discouraging this development (e.g. Lindblom-Ylänne & Lonka, 1999; Yerrick, Pedersen, & Arnason, 1998). Van Rossum and Hamer show that their six-stage developmental epistemological learning-teaching conception model (2010) [developed and expanded upon using over 1200 student narratives to date] can be used to understand and model teacher thinking (Richardson, 2012). This addresses the lack of research exploring how teachers' conceptions affect the success of efforts to trigger and encourage student epistemic growth (Van Rossum & Hamer, 2010). Assuming that the epistemic levels described in their six-stage epistemological development model reflect both student and teacher epistemic perspectives, measurements based on this model could be valuable for teachers to support self-reflection, as well as to inform how they handle the tension between teachers' and students' understandings of teaching and learning.
Existing inventories that focus on measuring how students view learning and teaching, such as the Epistemological Beliefs Questionnaire (EBQ) (Schommer, 1998) and the Inventory of Learning Styles (ILS) (Vermunt, 1996), have been linked to Van Rossum and Hamer's developmental model, but do not measure perspectives that have only been observed in qualitative data (e.g. Barzilai & Eshet-Alkalai, 2015; Baxter Magolda, 2001). These qualitative data are often transcribed interviews or essay-type answers to open-ended questions that require extensive coding and analysis. Apart from being time-consuming, both for participants and for researchers, these studies often do not examine large enough samples to capture the more elusive higher order ways of knowing, so relevant to learning in this time of super-complexity (Barnett, 2004; Van Rossum & Hamer, 2011). In Kjellström, Golino, Hamer, Van Rossum, and Almers (2016), we described the construction of a new questionnaire that aims to measure personal epistemology regarding educational concepts of teaching and learning as stages in development. This new questionnaire would include scales measuring the rarer and most complex levels, extending the existing measures, as well as provide an alternative to the labor-intensive data collection and data analyses that the qualitative methods require (for a review, see Van Rossum & Hamer, 2010).

The epistemological development in teaching learning questionnaire
The new questionnaire investigated in this study is titled the Epistemological Development in Teaching Learning Questionnaire (EDTLQ) (Kjellström et al., 2016). Five of the six scales of the EDTLQ are based on Van Rossum and Hamer's learning-teaching conception model (2010) and expansions regarding student perceptions of the characteristics of a good study book (Van Rossum & Hamer, 2013). Table 1 summarizes the characteristics of each epistemic position linked to the learning-teaching conception level (LC in Table 1) for these five scales.
The sixth scale of the EDTLQ was constructed based upon developmental responsibility research (Kjellström & Människa, 2005; Kjellström & Ross, 2011). Table 2 presents examples of items from each of the six scales.
The analysis presented in Authors blinded (2016) demonstrated that, of the six scales, four behaved as expected, i.e. the scales referring to discussion, application, understanding and good teaching. The other two scales, referring to views on a good study book and the responsibility for learning, did not fit the theoretically expected pattern as well, meaning that the expected epistemic rank order of the items was not always achieved. One posited reason for this finding was linked to the response group, in this case, teachers. Van Rossum and Hamer (2010, chapter 9) found that teacher responses often obscured their personal epistemic beliefs and their conception of teaching and learning by referring to acceptable constructivist pedagogical literature and socially acceptable discourse; in addition, their responses may not reflect their teaching behaviors in the classroom. Säljö had found similar issues using teacher responses (Säljö, 1994, 1997). However, the lack of accepted discourse on what constitutes a good study book may result in responses that reflect teachers' epistemic positions outside the available discourse. This means that it is currently unclear to what extent the scale on a good study book does not follow the theoretical model. Student responses, however, have been found to be less affected by pedagogical discourse, so one way to explore the level of fit for the good study book scale is to analyze student responses. A second option would be to administer the EDTLQ to teachers in an educational environment that is generally more traditional and reproduction-oriented and establish whether the responses reflect the dominant pedagogical perspective. The current paper addresses the first option, exploring student interpretation of the EDTLQ scales. At the same time, given earlier findings (e.g. Van Rossum, 2010), it is to be expected that the most frequent student epistemic position will be less sophisticated than that found for teachers.
A confirmatory factor analysis and a Rasch analysis showed that the items of the EDTLQ form a unidimensional scale (Author blinded, 2016), implying a single latent variable (i.e. epistemological development) underlying the scales derived from Van Rossum and Hamer's learning-teaching conception model (2010). As an indicator of instrument validity, the authors examined person separation reliability and the item separation reliability, which showed satisfactory levels of separation indicating that the rated response alternatives did not need alteration. An analysis of the endorsement of the statements reflected the preferred constructivist learning-teaching environment of the response group, Swedish teachers in Higher Education.
The results of this study seemed to point towards clear support of the unidimensional assumption underpinning a variety of influential theoretical models on epistemic development originating from a qualitative provenance (e.g. Perry, 1970; Belenky, Clinchy, Goldberger, & Tarule, 1986; Baxter Magolda, 2001; Kegan, 1994), but which had not been supported by quantitative attempts at measuring epistemic development (e.g. Schommer, 1998; Schommer, Calvert, Gariglietti, & Bajaj, 1997; Schommer-Aikins, 2004). However, Reise, Moore, and Haviland (2010) argue that a good fit for a unidimensional developmental latent variable may not describe the organization of an instrument in the most appropriate way, and recommend additional exploration of factor structures. A second reason to further explore possible factorial structures lies in Van Rossum and Hamer's suggested alternative to unidimensional development, i.e. the concept of an epistemological ecology, which explains the often-found robustness and resistance to change of students' and teachers' existing epistemic perspectives (Van Rossum & Hamer, 2010, pp. 187-188). In an epistemological ecology, various epistemic beliefs, conceptions or perspectives are closely linked, resulting in the observed highly correlated shifts in epistemic perspectives reflected in the developmental scales without requiring a single underlying unidimensional developmental variable. Kember (2001) proposed something similar when stating that conceptions of learning, of teaching, and of epistemological beliefs "form a consistent and logically inter-related set. A belief in one of these areas influences the other two beliefs, and all act in concert to affect learning approaches and outcomes" (Kember, 2001, p. 206). Van Rossum and Hamer pose that "[c]hange in any one of the dimensions (beliefs, conceptions etcetera) within an ecology, necessitates finding a new balance" (Van Rossum & Hamer, 2010, p. 185) and "confronting one or two elements of the set may perturb the whole set of beliefs" (Kember, 2001, p. 218), which is consistent with the embedded systemic model proposed by Schommer-Aikins (2004).
The epistemic ecology concept makes it possible that a pattern which emerges in large response groups supports the assumption of an underlying unidimensional developmental trajectory, while on an individual level, respondents may display variation in the degree to which the theoretically expected correlation of the scales occurs, resulting in occasionally atypical response patterns. The latter dynamic would reflect a deeper structure where separate developmental trajectories are so closely connected that a change in one automatically triggers movement of all others. This presents the image of one developmental trajectory but, in reality, reflects an almost simultaneous developmental direction of a discrete set of separate (perhaps correlated) developmental routes. This means that the initial results reported in Authors blinded (2016) regarding the psychometric properties of the EDTLQ require further study before proposing a convincing structural explanation of the observations. This is the main focus of this paper, and the analysis will compare three models that assume a single underlying explanatory variable and two models that do not. It is important to note, however, that the current paper will not focus on the structure of the EDTLQ in individuals, but in a group of individuals. Understanding the structure of epistemological development in teaching and learning at an individual level is very important, and should be the focus of future research.
In the present paper, three of the factor structures investigated posit a general latent variable or developmental dimension: the unidimensional model (Figure 1), the hierarchical model (Figure 2) and the bifactor model (Figure 3). The first model specifies that the EDTLQ items are explained by a single factor. The second model, in turn, is a hierarchical model (a second-order model), consisting of three levels: items, first-order factors and a higher order factor. According to this model, the first-order factors are correlated because they share a common cause, an overarching factor; in other words, the developmental dimension is a "second" or "higher-ordered" dimension that explains why the first-order factors (in this case the six dimensions of the questionnaire) are related (Figure 2). This model can be interpreted as a refinement of the unidimensional nature of epistemological development as found in the earlier study (Authors blinded, 2016), accommodating one general latent factor driving six specific factors.
However, Reise et al. (2010) recommend that psychological scales be investigated using the bifactor model, and that it be compared with more traditional unidimensional, correlated or higher order models. In the bifactor model, each item loads on both a specific factor and a general factor, which are orthogonal to each other. In this way, bifactor models allow the measurement of a single general latent trait, while controlling for the variance that emerges from additional specific factors. Thus, the bifactor model is a more appropriate way to investigate whether the items measure a single common factor, since the general and the specific factors compete to explain item variance. Reise et al. (2010) state that if the general factor accounts for a large share of the explained variance, even in the presence of specific factors explaining additional common variance for groups of items, then the items can be thought of as reflecting a single common dimension.
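As an illustration of how the general and specific factors "compete" for item variance, one index often used in the bifactor literature is the explained common variance (ECV): the share of common variance attributable to the general factor. The sketch below is not part of the analysis reported here; the function name and the loadings are hypothetical, and it assumes standardized loadings with each item loading on the general factor and on exactly one specific factor.

```python
def explained_common_variance(general_loadings, specific_loadings):
    """Share of common variance explained by the general factor (ECV).

    Assumes standardized loadings from an orthogonal bifactor solution,
    with one general and one specific loading per item.
    """
    if len(general_loadings) != len(specific_loadings):
        raise ValueError("Expected one general and one specific loading per item")
    # Squared loadings are the variance each factor explains in an item.
    general_variance = sum(loading ** 2 for loading in general_loadings)
    specific_variance = sum(loading ** 2 for loading in specific_loadings)
    return general_variance / (general_variance + specific_variance)


# Hypothetical loadings for four items (not taken from the EDTLQ data).
hypothetical_general = [0.7, 0.7, 0.6, 0.6]
hypothetical_specific = [0.3, 0.3, 0.2, 0.2]
ecv = explained_common_variance(hypothetical_general, hypothetical_specific)
print(f"ECV = {ecv:.3f}")
```

An ECV close to 1 would indicate that the general factor dominates the common variance, which is the situation in which the items can reasonably be treated as reflecting a single common dimension.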
The remaining two models to be examined reject the assumption of a single underlying developmental dimension and propose that the six scales are indeed separate. The fourth model explored here is a correlated trait model, which stipulates that items are explained by six factors (see Figure 4) and assumes a level of covariance between the latent factors reflected in the six scales, but no common underlying variable or developmental dimension. The final model applied to the data is an uncorrelated trait model, namely an orthogonal six traits model, which differs from the former in the absence of correlation between the traits, and would support a developmental model with independent latent traits, one for each theoretical domain of the EDTLQ (Figure 5).
In sum, the research questions addressed in this paper are:
(1) Can the unidimensional solution achieved on teacher responses in Authors blinded (2016) be recreated with similar levels of fit using student data?
(2) Which of the five factor structures is the best, in terms of fit?

Data collection
Two of the authors collected the data by sending web-based surveys to students at Jönköping University in 2014. An initial e-mail was sent including a letter of invitation explaining the purpose of the study, and how to handle the questionnaire, information about respondent's voluntariness, confidentiality regarding the answers, and about the right to withdraw from answering the questionnaire at any time. The information and the survey were provided in Swedish. Three reminders were sent within two to three weeks. The study was approved by the Regional Ethical Review Board in Linköping, Sweden.

Participants
The procedure resulted in a sample of 643 students (men = 15%, women = 81%, others = 4%). The majority were studying at the beginning of the first semester (68%) and the rest at the end of the sixth semester (32%). Ages ranged from 18 to 52 years (Mean = 25, SD = 7.87).

Measures
The survey comprised two smaller sections: one covering socio-demographic information (gender, age, educational level), and the other covering the EDTLQ. The EDTLQ consists of items formulated to represent a variety of developmental levels of teaching and learning concepts. The questionnaire included scales with six questions each regarding good teaching (bt), classroom discussion (di), understanding (un), application (ap), a good textbook (sb), and responsibility for learning (re). Swedish and English versions were created in an iterative translation process: the items were written in Swedish, translated into English, and then translated back and forth several times to ensure their compatibility. Examples of scale items are presented in Table 2. The response alternatives were constructed to represent a range of learning and teaching conceptions that could be presumed to appeal in different ways to various teachers/researchers and students, as appropriate to their level of thinking within the domain of epistemological development. The response alternatives were presented in random order. Individual statements could be used at several levels (Loevinger & Hy, 1996), so the goal was not to pick one statement at each level, but to create a range of statements with fewer at the ends of the scale. The participants were asked to rate how well or poorly each statement corresponded with their own views and opinions on a 5-point ordinal scale (1 = unimportant, 5 = most important).
The items of the EDTLQ were constructed to reflect the full range of Van Rossum and Hamer's six-stage developmental model (2016), as presented in Table 3.

Data analysis
The confirmatory factor analysis was conducted using the lavaan package (Rosseel, 2012) of the R statistical software (R Core Team, 2016). The five factor structures described earlier were estimated using the robust weighted least squares estimator (WLSMV). The figures were created using semPlot (Epskamp, 2014).
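To make the five factor structures concrete, the following minimal sketch generates model specifications in lavaan's model syntax (where `=~` reads "is measured by"). It is an illustration under stated assumptions, not the authors' analysis script: the item names (bt1 to re6) follow the scale abbreviations and the "six questions each" description given under Measures, and the label g for the general factor is hypothetical.

```python
# Scale abbreviations from the Measures section; six items per scale is assumed.
SCALES = ["bt", "di", "un", "ap", "sb", "re"]
ITEMS_PER_SCALE = 6


def items(scale):
    """Item names for one scale, e.g. bt1 ... bt6 (assumed naming)."""
    return [f"{scale}{i}" for i in range(1, ITEMS_PER_SCALE + 1)]


def all_items():
    return [item for scale in SCALES for item in items(scale)]


def measurement_lines():
    """One lavaan line per scale: each factor is measured by its own items."""
    return [f"{scale} =~ " + " + ".join(items(scale)) for scale in SCALES]


def unidimensional():
    """All items load on a single factor g (hypothetical label)."""
    return "g =~ " + " + ".join(all_items())


def six_factor():
    """Six first-order factors. The correlated and orthogonal variants share
    this syntax; in lavaan the difference would be set when fitting, e.g. via
    the orthogonal argument of cfa()."""
    return "\n".join(measurement_lines())


def hierarchical():
    """Six first-order factors explained by one higher order factor g."""
    return "\n".join(measurement_lines() + ["g =~ " + " + ".join(SCALES)])


def bifactor():
    """Each item loads on g and on its own specific factor; the factors
    would be constrained to be orthogonal when fitting."""
    return "\n".join([unidimensional()] + measurement_lines())


print(hierarchical())
```

Each generated string could then be passed to lavaan's cfa() together with the data, treating the items as ordered and requesting the WLSMV estimator.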
The fit of the items to the models was verified using the root-mean-square error of approximation (RMSEA), the comparative fit index (CFI) (Bentler, 1990), the normed fit index (NFI) and the nonnormed fit index (NNFI) (Bentler & Bonett, 1980). A good data fit is indicated by an RMSEA equal to or less than .06 (Browne & Cudeck, 1993), a CFI equal to or greater than .95 (Hu & Bentler, 1999), and an NFI and NNFI greater than .90 (Bentler & Bonett, 1980). Where necessary, modification indices were to be used to loosen constraints on certain parameters in order to improve the overall model fit. To compare the models, an analysis of variance (ANOVA) was performed.
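The fit indices and cutoffs above can be sketched from their textbook definitions, which compare the chi-square of the fitted model with that of a baseline (independence) model. The example below is illustrative only: the function names and the input values are hypothetical and are not the statistics of this study, and robust estimators such as WLSMV report scaled versions of these indices that software computes internally.

```python
import math


def fit_indices(chi2_model, df_model, chi2_baseline, df_baseline, n):
    """RMSEA, CFI, NFI and NNFI from model and baseline chi-square values."""
    # RMSEA: excess misfit per degree of freedom, scaled by sample size.
    excess_model = max(chi2_model - df_model, 0.0)
    rmsea = math.sqrt(excess_model / (df_model * (n - 1)))
    # CFI: relative reduction in excess misfit compared with the baseline.
    excess_baseline = max(chi2_baseline - df_baseline, excess_model)
    cfi = 1.0 - excess_model / excess_baseline if excess_baseline > 0 else 1.0
    # NFI: proportional reduction in chi-square relative to the baseline.
    nfi = (chi2_baseline - chi2_model) / chi2_baseline
    # NNFI: like NFI, but based on chi-square/df ratios (penalizes complexity).
    ratio_baseline = chi2_baseline / df_baseline
    ratio_model = chi2_model / df_model
    nnfi = (ratio_baseline - ratio_model) / (ratio_baseline - 1.0)
    return {"rmsea": rmsea, "cfi": cfi, "nfi": nfi, "nnfi": nnfi}


def meets_cutoffs(indices):
    """Apply the cutoffs cited in the text to each index."""
    return {
        "rmsea": indices["rmsea"] <= 0.06,
        "cfi": indices["cfi"] >= 0.95,
        "nfi": indices["nfi"] > 0.90,
        "nnfi": indices["nnfi"] > 0.90,
    }


# Hypothetical chi-square values; only n = 643 is taken from the text.
example = fit_indices(chi2_model=1200.0, df_model=579,
                      chi2_baseline=20000.0, df_baseline=630, n=643)
print(example)
print(meets_cutoffs(example))
```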

Results
Both the unidimensional model and the hierarchical model resulted in adequate levels of fit and comparable ranges and median standardized weights (see Table 4). Figure 6 shows the standardized weights of the unidimensional model, whilst Figure 7 presents the hierarchical model. The color saturation and the width of the edges correspond to the absolute factor loading and scale relative to the strongest factor loading.
The factor loadings of all scales taken together show that 9 of the 12 most easily endorsed items reflected levels 4 and 5, while 9 of the 12 items that were least likely to be endorsed reflected levels 2 and 3. These overall results, to some extent, match the pattern achieved using teacher responses. However, the patterns within each scale were less clear than those resulting from teacher responses (Authors blinded, 2016). In the hierarchical model, the good study book scale (sb) shows a clearly smaller factor loading (0.74) than the other five, which all have factor loadings between 0.90 and 0.97. Of the two remaining models, the correlated traits model also presented a good fit to the data (see Table 5 and Figure 8), with standardized factor loadings ranging from .14 (item = ap4) to .75 (item = bt5), with a median of .59. Further, the underlying traits for good teaching (bt), responsibility for learning (re), application (ap) and understanding (un) are highly correlated, with correlations between 0.85 and 0.98. The trait for a good study book (sb) is less closely correlated with any of the other traits. Finally, the orthogonal traits model, assuming no correlation between the separate scales, did not result in a good fit to the data (see Table 5).
In order to establish which model reflects the data best, the three models with adequate fit (the unidimensional, hierarchical and correlated six-trait models) were compared using the scaled chi-square difference test proposed by Satorra and Bentler (2010). The correlated six-trait model fits significantly better than both the hierarchical and the unidimensional model. Therefore, the correlated six-trait model presents the best fit to the data (see Table 5).

Discussion
A previous analysis of the EDTLQ showed an adequate fit of the unidimensional model using data from teachers (Authors blinded, 2016). In the present study, the unidimensional model also presented a good fit to the data, gathered in a group of college students. To address the second research question, examining the best fit of proposed underlying psychometric structures, five factor structures were investigated: three supporting a single underlying factor and two supporting multiple factors. The three models supporting a unidimensional hypothesis are a unidimensional model, a hierarchical structure with six first-order factors and a bifactor structure. The two models rejecting the unidimensional hypothesis are a correlated structure with six factors and an uncorrelated structure with six factors. All models except the bifactor model converged and presented adequate fit to the data. Although the four converging models all presented adequate fit, when the scaled chi-square difference test proposed by Satorra and Bentler (2010) was used, the correlated traits model showed a significantly better fit than any of the other models.
The present results can be viewed as support for the notion of epistemological ecology, where various epistemic beliefs, conceptions or perspectives underlying a set of separate scales are closely linked, but each with a different focus or driving realization. This view does not require a single underlying unidimensional variable of epistemological development and is in accordance with the perspective of a multiple-trait construct (Kember, 2001). The idea of an underlying dimension can be misinterpreted through a widespread cultural metaphor of development as a ladder, which conceives of development as a simple linear process (Fischer and Bidell, 2006). The ladder metaphor misses the richness of development and the variability across individuals at a particular age or stage of development and within individuals, both in terms of cross-domain and intra-domain skills (Fischer and Pruyne, 2003). A more useful metaphor for development may then be the ecology, a strongly linked constellation of many beliefs, each with a range of epistemic positions. As in biological ecologies, an epistemological ecology (Van Rossum & Hamer, 2010) is characterized by many feedback loops, making an epistemic position fairly robust and less sensitive to quick change. However, when sufficient pressure is applied to one or more beliefs, perhaps by the realization that the existing approach to learning is no longer appropriate and does not lead to study success, a tipping point occurs where the ecology can "snap" into a new, again very closely linked constellation. The new constellation of beliefs then expresses itself in a coherent way, creating a more or less consistent profile of beliefs and perspectives on learning, good teaching, the purpose of discussion in learning, application and understanding, and so on. It may be a consequence of the relatively quick "snap" into a new constellation that relatively few examples are found of intermediate, less coherent profiles. However, Van Rossum and Hamer did include examples of atypical profiles, or perhaps epistemic positions-in-transition (Van Rossum & Hamer, 2010, pp. 391-394). Because of their possible rarity, Van Rossum and Hamer could only include quotes from students moving from level 2 to 3 (Level-three-thinking-foreseen) and moving from level 3 to 4 (Level-four-thinking-foreseen). Similarly, atypical or less coherent profiles of epistemic beliefs under pressure have been reported in more quantitative approaches to measuring student epistemic thinking (e.g. Vermetten, Vermunt, & Lodewijks, 1999).
The model resulting in the best fit to the data, the correlated six-factor structure, supports the interpretation of the EDTLQ as providing a set of six distinct scores, one for each theoretical domain or trait; together, these scores present a dominant epistemic profile on the six scales. It is important to note that the current investigation does not focus on individuals, but on groups of individuals. In addition to more analyses to untangle possible group-wise developmental differences within the current respondent group, future research will need to concentrate on how the scales behave when measuring individual epistemic development. Such data collection would need to use an intensive longitudinal design, in order to verify the structure of the EDTLQ in each individual over time. For some individuals, there may be only one single latent variable of epistemological development. For other individuals, a group of latent variables, separated by domain, may be found. Discovering which personal characteristics predict whether a person presents one structure or another could help us better understand how epistemological beliefs develop.
The EDTLQ has promising potential as a tool for formative assessment with the aim of shaping the teaching and learning environment. When the EDTLQ as presented in this and an earlier study (Authors blinded, 2016) is applied to establish group preferences, outcomes can be used across groups of students to optimize teaching in relation to each of the six domains, helping the instructor/professor to better understand the situational factors that affect a given course. For example, when using the EDTLQ as assessment for learning, a teacher may discover that the students enrolled in their course on developmental psychology predominantly prefer relatively unsophisticated views on what constitutes a good study book, the purpose of discussions in the classroom, good teaching, learning, and so on. The teacher can then design and prepare in-class activities that help create a developmental path towards more complex levels of thinking about learning. A later follow-up with the EDTLQ can function as a formative assessment and inform the teacher about the effects of the new in-class activities on students.
Another possible formative use of the EDTLQ is to measure the impact of different teaching styles or strategies on the development of each domain, in order to evaluate teaching as a contribution to epistemic development. A possible approach could be to compare courses, programs and teaching methods as tools for support and scaffolding, examining to what degree they promote or even obstruct student epistemic development. In this sense, longitudinal studies can be useful as a means for pedagogical improvement by evaluating the long-term effect of curriculum and teaching methods on individual students' epistemological development during their studies. This would require further development of the test so that it can provide individual scores, and a new study focusing on the intraindividual variability of epistemological development. Future longitudinal studies can also be used to see how conceptions of teaching and learning change over time and across domains.
Future research could provide insight into variations in the structure of the EDTLQ, or the lack thereof, across disciplines, age groups and gender, as well as explore the more direct predictive value of individual scores on the instrument, i.e. whether it can be used to predict academic outcomes, such as grades.
However, the results of Authors blinded (2016) and this study point towards the need for refinement of the items and the scales, new data collection for students and teachers in different educational environments to improve the psychometric properties of scales, as well as analysis aimed at identifying specific epistemic profiles within the response groups.