Measuring teacher authenticity: Criteria students use in their perception of teacher authenticity

Abstract Authenticity is an often-heard term with respect to education. Tasks should be authentic, the learning environment should be authentic and, above all, the teacher should be authentic. Previous qualitative research has shown that there are four primary criteria that students in formal educational settings use when forming their perceptions of teacher authenticity, namely: Expertise, Passion, Unicity and Distance. This quantitative study validates these qualitative results and finds a possible variation of the original theoretical model in which there is no distinction made by students between Expertise and Passion, and the criterion of Distance is split into two new criteria: Strictness and Proximity.


PUBLIC INTEREST STATEMENT
In earlier, qualitative research, we asked high school students what, in their eyes, an authentic teacher is. First, teachers were perceived as being authentic when they know what they are talking about and translate the subject matter to the students' knowledge level (Expertise). Second, they are passionate about what they teach (Passion). Third, they give students the feeling that each of them and each class is different (Unicity). Finally, authentic teachers are not friends with their students, but have an interest in them (Distance).
We have now checked these findings via quantitative research to validate these criteria. Our analyses showed that the criteria of Expertise and Passion are so intertwined that students need to see them both to perceive a teacher as authentic. Unicity remained, but the Distance criterion split into Proximity (being there for students but within certain boundaries) and Strictness (being more rigid and focused on the teaching process). This study helps us to understand and build better relations between students and their teachers.
social-constructivist approach, learning environments should be as authentic as possible (Gulikers, Bastiaens, & Kirschner, 2004; Petraglia, 1998). However, "just being yourself" can be one of the most complicated tasks you can give to someone, particularly if they are a teacher or a student-teacher (Gallego, 2001). This may sound bizarre, because how could one ever be someone else? But then, the main question is: What does it mean to be "yourself"? As Trilling (1974) describes, authenticity originally did not mean being true to oneself, but being true to others. The latter can be recognized in the work by Buchmann (1986) who links authenticity to morality.
Discussions on authenticity in education can also be seen in the transformative school movement, in which authentic teachers are described as:
• showing consistency between values and actions;
• relating to others in ways which encourage their authenticity; and
• engaging in critical reflection on teaching practice (Cranton & Carusetta, 2004).
Here, being true to others and being true to oneself are combined with a third element, relating to professional development.
Authenticity in education, while an often-used term and often-debated topic, seldom gets past the realm of philosophy (e.g. Kreber et al., 2007), and research in which pupils or students are involved is even rarer. In a previous phenomenographic study of the concept of authenticity of teachers, De Bruyckere and Kirschner (2016) determined criteria that students use in their perception of teacher authenticity. Based on a qualitative phenomenographic analysis, they discerned four criteria which determine authenticity in the eyes of the student, namely: Expertise, Passion, Unicity and Distance. As phenomenographic research can be influenced by the interpretations made by the researchers, triangulation of these findings with independent measures is needed (Miles & Huberman, 1994) to provide "evidence, whether convergent, inconsistent, or contradictory, such that the researcher can construct good explanations of the social phenomena from which they arise" (Mathison, 1988, p. 15).
This new study was performed to validate the qualitative findings via a quantitative survey design to answer two research questions. The first question was: (1) Can the four criteria (i.e. Expertise, Passion, Unicity and Distance) that students were thought to use in their perception of a teacher as authentic be validated?
But validating the existence of these four criteria is not enough, as it could well be that there are other criteria that were not found in the original research. Thus, the second research question was: (2) Can criteria other than the four found earlier be discerned which students use in their perception of a teacher as authentic?

Background
De Bruyckere and Kirschner (2016) described four criteria that students use when they try to determine whether their teachers are being authentic, in other words, factors which influence their perception of teacher authenticity.
The first criterion was labelled "Expertise" and from the analysis it was apparent that students have clear conceptions about what teaching and teacher expertise should be if teachers are to be considered authentic:
• Students expect to learn something from the teacher.
• A teacher is someone who deeply understands her/his teaching domain and knows how to explain the subject matter well.
Hidden in these three elements is the distinction between being an expert in a certain topical domain and being an expert teacher in that domain. When discussing expertise as a criterion, the respondents clearly meant the latter.
The second criterion was labelled "Passion" and the qualitative analysis showed that students expect harmonious passion from their "authentic" teachers. Harmonious passion is when the activity of teaching is internalized by the teacher's identity and the activity of learning is freely accepted as important. This is in contrast to obsessive passion where teaching might be the only activity that allows the teacher to maintain a sense of self-worth (Carbonneau, Vallerand, Fernet, & Guay, 2008).
The third criterion, Unicity, was closely related to the idea that "no two people are the same". The premise the respondents describe is that "… if every student is different, every class group is different, and so every teacher should act differently" (De Bruyckere & Kirschner, 2016, p. 12).
The final criterion was labelled "Distance" and discusses the role of the relationship between students and teachers in the perception of teachers as being authentic. Students thought it important that teachers show an interest in them and in who they are during informal moments, but they themselves are less interested in the personal lives of their teachers. De Bruyckere and Kirschner concluded that students "… want to maintain a distance. And if this distance is bridged, it should occur in informal moments both between classes and during extra-curricular activities" (p. 13).

Research strategy
To answer the two research questions, a survey was developed and administered to a large sample of students in Flanders, the Dutch-speaking part of Belgium. This section provides a review of the development of the survey instrument in Dutch and the data collection carried out with it. The procedure for developing the survey was as follows:
(1) Preliminary development of the survey about teacher authenticity, based on items from the original qualitative data-set supplemented by extra items from Prick (1983).
(2) Pilot study of the survey in a convenience sample of 42 students from the two last years of compulsory general secondary education from a Flemish public school (age 17-18, Grades 11, 12 and 13).
(3) Expert panel giving feedback on the initial survey.
The final version of the survey consisted of the following types of items (cf. Table 1).
The 75 items were formulated as 11-point Likert scale items (0 = absolutely not important for regarding a teacher as being authentic to 10 = absolutely very important for regarding a teacher as being authentic). This scale is item-specific in that rather than asking the respondent to agree or disagree, the choices represent specific answers for the item at hand (Saris, Revilla, Krosnick, & Shaeffer, 2010). Various studies have shown that item-specific scales yield more reliable results (Scherpenzeel & Saris, 1997), especially when the scale is rather large (Saris & Gallhofer, 2007). The items on job satisfaction, job content, person-oriented and topical knowledge were based on Prick (1983). The items on pedagogy were based on Sol's (2012) validated survey on teacher approaches. Also included were nine control items, allowing later data cleaning (cf. infra, Billiet & McClendon, 1998). Including these items allowed the monitoring of possible acquiescence bias (Holbrook, 2008). The final version of the survey can be found in Appendix A.

The sample
Eight hundred respondents were recruited for this survey, which corresponds to the "rule of thumb" of providing at least 10 respondents per item (Velicer & Fava, 1998). Because the original qualitative research involved students from three different educational tracks (i.e. general, technical and vocational education), the survey was also administered to students from these tracks. In other words, a stratified sample was used. To this end, 16 schools from the provinces of East and West Flanders were selected, from both public and Catholic schools and in both rural and urban areas. Each school was requested to supply at least 50 students from grades 11 and 12 to fill in the survey during a 2-week period. Of the 16 schools, 11 responded positively. Although three schools could not deliver the requested number of students, other schools provided more students. In total exactly 1,400 surveys were filled in, 552 by students from the province of West Flanders and 878 by students from the province of East Flanders (490 males, 896 females, 14 missing values). Of these, 572 were in general secondary education, 498 in technical secondary education and 316 in vocational secondary education; again there were 14 missing values.
To guarantee that the sample was representative of the whole of Flanders, a number of checks were carried out to determine whether the educational tracks and ages in the sample were representative of the general population. The sample proved to be a good representation of the total population. More information on the sample can be found in Appendix B.
To guarantee the quality of the data, different forms of control were performed. A first control consisted of checking missing values per questionnaire. Respondents with more than 10 blank items were excluded from the data file, which led to the exclusion of 56 (4.04%) of the respondents from the analyses.
Second, the control items were used for further data cleaning (cf. Figure 1). Each control item was a negative version of a regular item, which delivered nine dichotomous pairs of items. Nine new variables (one for each pair) were computed by adding the score on the original item to the score on its dichotomous counterpart. If a respondent scored 18 or higher on one of these variables (each item being scored 0-10), this means that the student scored both the item and its opposite very high (e.g. being honest as very important and being dishonest as very important). Thus, respondents with a score ≥ 18 on one or more of those control variables were also excluded from further analyses. As a consequence 436 questionnaires were excluded (31.07%). This left 901 questionnaires. After evaluating them for missing values on crucial background variables, such as gender or track, 815 valid cases remained for the analysis. In the final sample there is an overrepresentation of girls (N = 547; 66%) over boys (N = 268; 34%). This is due to selection within some of the schools, where the surveys were given to specific vocational tracks such as health care and wellness that traditionally consist of more female students. The sample consisted of 398 students in grade 11, 395 in grade 12 and 22 in grade 13.

[Table 1: columns Concept, Items (n), Scale; first row: Expertise, 10 items]
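The two quality controls described above can be sketched as follows. This is a minimal illustration in Python, not the procedure or software actually used in the study: the data layout (one dictionary of item scores per respondent, with `None` for a blank answer), the item names and the helper names are all hypothetical. Only the two thresholds (more than 10 blank items, a pair sum of 18 or higher) are taken from the text.

```python
MAX_BLANK_ITEMS = 10      # more than 10 blank items leads to exclusion
CONTROL_SUM_CUTOFF = 18   # item + negated counterpart, each scored 0-10


def too_many_blanks(answers):
    """Return True if the respondent left more than MAX_BLANK_ITEMS items blank."""
    blanks = sum(1 for score in answers.values() if score is None)
    return blanks > MAX_BLANK_ITEMS


def fails_control_check(answers, control_pairs):
    """Return True if any item and its negated counterpart sum to the cutoff or more."""
    for item, negated_item in control_pairs:
        a, b = answers.get(item), answers.get(negated_item)
        if a is None or b is None:
            continue  # assumption: a pair with a blank answer cannot be checked
        if a + b >= CONTROL_SUM_CUTOFF:
            return True
    return False


def clean(respondents, control_pairs):
    """Apply both quality controls and return the respondents that pass."""
    return [
        answers for answers in respondents
        if not too_many_blanks(answers)
        and not fails_control_check(answers, control_pairs)
    ]


# Hypothetical example with a single control pair (honest vs. dishonest).
pairs = [("honest", "dishonest")]
respondents = [
    {"honest": 9, "dishonest": 1},      # consistent answers: kept
    {"honest": 10, "dishonest": 9},     # sum is 19, at or above 18: excluded
    {"honest": 8, "dishonest": None},   # pair cannot be checked: kept
]
kept = clean(respondents, pairs)
print(len(kept), "of", len(respondents), "respondents kept")
```

How a control pair with a blank answer should be handled is not stated in the text; the sketch skips such pairs, which is only one possible choice.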

Analyses
As this research is intended to validate the original four criteria found in the qualitative research, the construct validity needed to be determined (Stapleton, 1997). This was done using a confirmatory factor analysis on the survey items that were based on that research. Further, to determine whether there are other criteria different from the four, a principal component analysis was conducted on all the items in the survey, both the items based on the qualitative research and the added items based on Prick (1983) and Sol (2013).

Confirmatory factor analysis
To test whether the measures of the constructs used here are consistent with the nature of that construct (i.e. the construct validity), confirmatory factor analysis (CFA) was carried out (Jackson, Gillaspy, & Purc-Stephenson, 2009). To report the procedure and results of this CFA the authors followed the guidelines formulated by Schreiber, Nora, Stage, Barlow, and King (2006). As no missing values are allowed in a CFA procedure, this problem had to be tackled first, namely via list-wise deletion. Little's MCAR test disclosed that the missing values were not random and thus multiple imputation was not an option. A sensitivity analysis was performed and disclosed that neither gender nor track could explain the fact that the missing values were not random.
The analysis started with a model with four latent variables, one for each of the four criteria. Expertise, Passion, Unicity and Distance were presumed to be latent and observable through the corresponding items linked to each criterion. Following Gaskin (2012), the model was adapted to obtain a better fit of the model to the data, based on modification indices and goodness-of-fit measures. To this end, some variables were removed and other variables constrained; an overview of the different steps in the process is available upon request. The final result of the model is displayed in Figure 2. The descriptions of the items in this figure are translated from the original language and shortened so that the figure could fit the page.
Based on Hair, Black, Babin, and Anderson (2010, p. 651), a number of indices were used to determine the goodness of fit of the model (see Table 2).
The model scores a good (i.e. cmin/df = 2.583), moderate (i.e. GFI = .932 and RMSEA = .928) and traditional (i.e. CFI = .928) fit on the different given indices. Even a good fit does not mean that the model is "proven" as such, but there is a given chance that the model is correct.
It was necessary to establish convergent and discriminant validity, as well as reliability, when doing a CFA. There are a number of useful measures for establishing validity and reliability: composite reliability (CR), average variance extracted (AVE), maximum shared variance (MSV) and average shared variance (ASV). Hair et al. (2010) suggested several thresholds, summarized in Table 3. Tables 4(a) and 4(b) display the different measures for the model. These measures indicate that there are reliability issues for both Unicity and Expertise, while there are both convergent and discriminant validity issues for all four factors. Gaskin (2012) explains what this means as follows:
If you have convergent validity issues, then your variables do not correlate well with each other within their parent factor; i.e. the latent factor is not well explained by its observed variables. If you have discriminant validity issues, then your variables correlate more highly with variables outside their parent factor than with the variables within their parent factor; i.e. the latent factor is better explained by some other variables (from a different factor), than by its own observed variables. (n.p.)
As the number of variables that remain for both Expertise and Passion is limited, further adaptation of the model is theoretically impossible. This means that, based on these data, a validation of the theoretical model via a confirmatory factor analysis is not really possible. Our first research question, namely whether the four criteria (i.e. Expertise, Passion, Unicity and Distance) that De Bruyckere and Kirschner (2016) found to be used by students in their perception of a teacher as authentic can be validated, should be answered with a "no".
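To make the four measures concrete, the sketch below computes CR, AVE, MSV and ASV for a single factor from its standardized loadings and its correlations with the other factors. It is an illustration only: the loadings and correlations are made up, the function names are hypothetical, and the cut-offs (CR > .70, AVE > .50, MSV < AVE, ASV < AVE) are the values commonly attributed to Hair et al. (2010). They are assumed here because Table 3 is not reproduced in this text.

```python
def composite_reliability(loadings):
    """CR = (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances)."""
    sum_loadings = sum(loadings)
    error_variance = sum(1 - l ** 2 for l in loadings)
    return sum_loadings ** 2 / (sum_loadings ** 2 + error_variance)


def average_variance_extracted(loadings):
    """AVE = mean of the squared standardized loadings."""
    return sum(l ** 2 for l in loadings) / len(loadings)


def shared_variances(correlations_with_other_factors):
    """Return (MSV, ASV): the maximum and the mean squared inter-factor correlation."""
    squared = [r ** 2 for r in correlations_with_other_factors]
    return max(squared), sum(squared) / len(squared)


def validity_report(loadings, correlations_with_other_factors):
    """Collect the four measures and check them against the assumed cut-offs."""
    cr = composite_reliability(loadings)
    ave = average_variance_extracted(loadings)
    msv, asv = shared_variances(correlations_with_other_factors)
    return {
        "CR": cr, "AVE": ave, "MSV": msv, "ASV": asv,
        "reliable": cr > 0.70,                          # assumed cut-off
        "convergent_valid": ave > 0.50,                 # assumed cut-off
        "discriminant_valid": msv < ave and asv < ave,  # assumed cut-off
    }


# Hypothetical factor with three items and two neighbouring factors.
report = validity_report(loadings=[0.7, 0.7, 0.7],
                         correlations_with_other_factors=[0.5, 0.3])
for name, value in report.items():
    print(name, value)
```

In this made-up case the factor passes the reliability and discriminant validity checks but narrowly misses the convergent validity check, because its AVE falls just under .50.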

Principal component analysis
To answer the second research question, namely whether there are other possible criteria that students use in their perception of teachers as authentic, a more exploratory technique was needed. Principal component analysis (PCA) is such a technique (Jolliffe, 2002). PCA results in a factor matrix in which the first principal component explains the largest possible variance (i.e. accounts for as much of the variability in the data as possible) and each succeeding component in turn explains the highest remaining variance possible.
While in the CFA only the items based on the qualitative research were used, the PCA performed to answer the second research question included all items except the control items. First, it was necessary to check whether it was advisable to perform this analysis. The Kaiser-Meyer-Olkin measure of sampling adequacy was .911 (criterion: >.800) and Bartlett's test of sphericity was significant at p < .001 (criterion: p < .01).
In the PCA, an eigenvalue of 2 was used as the cut-off for components based on the scree plot (Cattell, 1966; cf. Figure 3). The rule of thumb suggests an eigenvalue of 1, but as the scree plot shows there is a clear group of four components, explaining 35.282% of the total variance (cf. Table 5).
The PCA was performed with varimax rotation. An item was retained for a component if it loaded >.40 or <−.40 on that component and if it did not load between .3 and .4 or between −.3 and −.4 on another component. The PCA was also performed a second time on only the items of the four components, resulting in similar findings. It was also checked whether deleting an item could raise the alpha, but this was not the case for any of the components. In what follows the different components found via the PCA will be discussed.
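The retention rules described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the original analysis: the eigenvalues and loadings are invented, the function names are hypothetical, the bounds .30 and .40 of the cross-loading band are treated as inclusive, and the explained-variance calculation assumes a PCA on the correlation matrix, where the total variance equals the number of items. The text does not say what happens when an item loads above .40 on two components; the sketch simply reports every component for which the item qualifies.

```python
EIGENVALUE_CUTOFF = 2.0         # cut-off chosen from the scree plot
PRIMARY_CUTOFF = 0.40           # |loading| must exceed this on the component
CROSS_LOW, CROSS_HIGH = 0.30, 0.40  # disqualifying band on any other component


def components_above_cutoff(eigenvalues, cutoff=EIGENVALUE_CUTOFF):
    """Keep the eigenvalues that exceed the cut-off."""
    return [e for e in eigenvalues if e > cutoff]


def explained_variance_pct(kept_eigenvalues, n_items):
    """Share of total variance explained, assuming a correlation-matrix PCA."""
    return 100 * sum(kept_eigenvalues) / n_items


def retained_components(item_loadings):
    """Return the indices of the components for which an item is retained."""
    magnitudes = [abs(x) for x in item_loadings]
    retained = []
    for i, value in enumerate(magnitudes):
        if value <= PRIMARY_CUTOFF:
            continue
        cross_loads = any(
            CROSS_LOW <= other <= CROSS_HIGH
            for j, other in enumerate(magnitudes) if j != i
        )
        if not cross_loads:
            retained.append(i)
    return retained


# Hypothetical eigenvalues for a 60-item survey.
kept = components_above_cutoff([10.0, 5.0, 3.0, 2.2, 1.9])
print(len(kept), "components,", round(explained_variance_pct(kept, 60), 1), "% explained")

# Hypothetical rotated loadings of three items on four components.
print(retained_components([0.62, 0.10, 0.05, -0.12]))   # clean loading on component 0
print(retained_components([0.55, 0.35, 0.00, 0.00]))    # cross-loading: item dropped
print(retained_components([-0.48, 0.20, 0.10, 0.00]))   # negative loading also counts
```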

PCA component 1: Proximity
Looking at the different items that constituted this component, it was apparent that the items with the highest loading represented a certain attitude of proximity towards the students, based on respect but within boundaries. These boundaries resulted in the teacher being considered as being fair. Based on the element of respect, it was not strange here that a teacher should also invest in her/his  teaching by not being dull and by having a personal teaching style (i.e. respecting the students by being a caring teacher). This personal element also needed to be present in the interactions with students. Promoting cooperation between students was also present in this component. The teacher who creates the right amount of proximity also stimulates a positive atmosphere, so that students can be close to each other in a working relationship with respect for each other.

PCA component 2: Live to teach
The items that constituted this component related to the teaching aspect of being an authentic teacher. A teacher who is perceived as being authentic puts a lot of effort into her/his work and lives for the job because of a strong personal interest in, and liking for, the subject. Because of this, the teacher has plenty of power to convince and wants students to succeed by teaching well, resulting in good results for the students.

PCA component 3: No textbook teacher
This component described a teacher who talks about her/himself, but the other items make clear that this is because, as an authentic teacher, (s)he is not limited to what is written in textbooks or official curricula. The teacher can talk about her/his own life, but also will add extra-curricular topics and things to lighten up the classroom routine. Not strictly adhering to the curriculum is also reflected in putting less emphasis on rules.

PCA component 4: Strictness
This fourth component with an eigenvalue higher than 2 consisted of only four items and was the weakest component of the PCA. This last component could not be retained, since the Cronbach alpha for this component is .59, making it less reliable. The component was named Strictness, as it emphasized strictness in being a teacher. Being rigid as a teacher and wanting to start right away after a holiday break indicates a teacher less focused on the student and more on the teaching (cf. Component 1, Proximity). At the same time, the teacher is proud of being a teacher and believes that what (s)he does is important for the student.
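For readers unfamiliar with the reliability measure used here, the sketch below shows how Cronbach's alpha and the "alpha if item deleted" check can be computed for a small scale. It is a minimal illustration: the scores are invented, the function names are hypothetical, and sample variances (n − 1 in the denominator) are assumed. A value of about .70 is a common convention for acceptable reliability; that convention is an assumption here, as the text only reports .59 as less reliable.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def alpha_if_item_deleted(rows):
    """Alpha of the scale recomputed with each item left out in turn."""
    k = len(rows[0])
    return [
        cronbach_alpha([row[:i] + row[i + 1:] for row in rows])
        for i in range(k)
    ]


# Hypothetical 0-10 scores of five respondents on a four-item scale.
scores = [
    [8, 7, 9, 3],
    [5, 5, 6, 7],
    [2, 3, 2, 5],
    [9, 9, 8, 4],
    [4, 5, 5, 6],
]
print("alpha:", round(cronbach_alpha(scores), 3))
print("alpha if item deleted:", [round(a, 3) for a in alpha_if_item_deleted(scores)])
```

In this invented data set the fourth item runs against the other three, so removing it raises alpha. In the study itself no such improvement was found for any component.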
A PCA was performed to answer the second research question as to whether there were other possible criteria that play a role in perceiving a teacher as being authentic. The answer to this question is more nuanced than a simple yes or no. The exploration using PCA revealed a possible variation of the original theoretical model, a model that at first seemed to be refuted by the answer to the first research question. The four empirical components were not identical to the theoretical model based on the qualitative research, but shared some common insights. These results need to be combined with the insights from the CFA performed to answer the first research question, which is done in the discussion. The original criterion of Distance was replaced by two separate criteria: Proximity and Strictness, the latter being less reliable as a scale. At the same time two original criteria, Passion and Expertise, were so intertwined that in the PCA they became one criterion: Live to teach.

Conclusion and discussion
While the analysis of the data at first sight did not allow the validation of the four criteria that students use in their perception of their teachers as authentic, as described by De Bruyckere and Kirschner (2016), further analyses performed to answer the second research question gave a more nuanced view and proposed a slightly altered, alternative model in which the four criteria could be recognized.
The first thing to be noticed is that the original criterion of Distance seemingly became two different components. While in the theoretical model the criterion of Distance combined being close to students while maintaining a certain distance, in the PCA this became two distinct components, one being Proximity, the other being Strictness (although there were reliability issues with this latter component). The criterion of Proximity also involved another nuance in comparison with the original criterion of Distance, since it also included the element of being fair, although still suggesting a kind of distance between the teacher and the student. This insight is relevant to the discussion mentioned in the introduction concerning the difference between being true to others and being true to yourself, as it is closer to the elements described by Cranton and Carusetta (2004) and to morality (being fair) as described by Buchmann (1986). This also underlines the importance of authenticity in the relationship between teachers and learners.
The criteria Expertise and Passion seemingly disappeared in the CFA, but in fact returned in the PCA as the combined component of "Live to Teach". The items that made up this criterion described a teacher who is passionate about his/her job and topic, and wants the students to learn and succeed by teaching well. This component of "Live to Teach" also included the element of being a teaching expert who is willing to put a lot of effort into building and maintaining their Expertise. It may seem odd that Expertise and Passion are so interrelated, since it is possible to imagine a teacher who is passionate but who lacks Expertise. However, one needs to bear in mind that this research was not about how a teacher should be or what a good teacher should be. This study showed that to be perceived as authentic by students, a teacher needs to both have Expertise and be passionate about their job.
The third component of "No Textbook Teacher" described a teacher who breaks out of the limitations created by curricula and textbooks. This component resembled what was described as Unicity in the original qualitative research, but while there the element of differentiation was present (the classes need to be unique, because all students are different), this element was missing in the new component of "No Textbook Teacher". This component focused more on adding extracurricular elements to the lesson (which also makes a lesson unique!), and also described a teacher who wants to go the extra mile for their students.
In conclusion, the original model based on qualitative research with the criteria of Expertise, Passion, Unicity and Distance was replaced by an adapted model that resembled the original model, but consisted of Proximity (Distance), Live to Teach (Passion + Expertise) and No Textbook Teacher (Unicity).

Limitations of the present research
This quantitative study has its limitations. The respondents, besides the biases described, were all students from two provinces in Flanders: East and West Flanders. While this group resembled the origin of the respondents in the qualitative study, for which this quantitative study sought validating insights, there can always be unknown context elements that play a part in their responses. It is also hard to tell if younger students would give the same results. This is, of course, an interesting opportunity for further research.
There is at present a major discussion in both psychology and educational sciences about replication (Makel & Plucker, 2014). This study wants to be an invitation to other researchers to replicate at least the quantitative research, not per se to debunk the results, but rather to refine the insights established and to see if other students in other regions and from different age groups and cultures are similar or dissimilar.

School
Finally, we would like to ask you how much you enjoy being at school.
I usually like it at school. 0 1 2 3 4 5 6 7 8 9 10
I would like to change schools. 0 1 2 3 4 5 6 7 8 9 10

The survey was administered in the provinces of both East and West Flanders. Table B1 describes the sample frame. The track of arts education was not included, which explains the difference of 3,390 students. Table A2 describes the different steps that led to the 894 valid cases used for the analyses, describing the impact of the different quality controls (QC).
Looking at gender, an overrepresentation of girls (66%) over boys (34%) in the sample can be noted. This is due to selection within some of the schools where the surveys were given to e.g. technical tracks with more girls present.
Looking at the parameters of location for age, it is noticeable that most of the respondents are 16 or 17. The aim was for students from each track to participate in the survey, which was achieved. The following crosstab in table 4 on grade, gender and form of education shows where the deviations are present: