Towards common ground in measuring acceptance of evolution and knowledge about evolution across Europe: a systematic review of the state of research

Relatively little information is available regarding the level of acceptance of evolution and knowledge about evolution in different educational settings in Europe. The aim of the present study is to fill this gap and provide a comprehensive overview of the current state of research regarding evolutionary knowledge and acceptance of students and teachers across Europe, based on a systematic literature review. We identified 56 papers for the period 2010–2020, presenting results for 29 European countries. Both knowledge and acceptance of evolution were assessed in 17 studies. Out of the 13 instruments most commonly used in the literature, five have been used in the European context so far: ACORNS, CINS, I-SEA, KEE and MATE. Thirty-one other instruments were identified, of which 16 were used in studies on knowledge and 15 in studies on acceptance. The extent of knowledge was hard to compare even within groups of the same education level because different instruments were applied and different key concepts were assessed. Our results illustrate the persistence of misconceptions through all education levels. Comparing acceptance among different education levels and countries revealed a high diversity. However, a lack of evolution in curricula tended to be associated with rejection of evolution in some countries. European studies that investigated both acceptance of evolution and knowledge about evolution varied highly concerning the existence and strength of the relationship between these factors. However, some trends are visible, such as a stronger relationship at higher education levels. The present review highlights the lack of a standardized assessment of evolutionary knowledge and acceptance of evolution across Europe and, therefore, of reasonably comparable data. Moreover, the review revealed that only about one-third of all studies on acceptance and/or knowledge about evolution provided evidence for local validity and reliability.
We suggest the use of assessment categories for both knowledge and acceptance instruments to allow for interpretation and comparison of sum scores among different sample groups. This, along with prospective comparative research based on similar samples, paves the way for future research aimed at overcoming current biases and inconsistencies in results.


Background and aim of the paper
Evolution is the backbone of modern biological studies as it provides the unifying framework within which all biologists, from a diversity of branches and subdisciplines, ask questions about the living world. A basic understanding of central evolutionary concepts is thus considered essential for biological education and scientific literacy. The Council of Europe (COE) and different scientific organizations within Europe have underlined the importance of promoting the teaching of evolution in school curricula as a fundamental scientific theory and have opposed teaching creationism on an equal footing, as if it could claim scientific respectability (e.g., COE, Resolution 1580 2007; Ecsite 2008; German National Academy of Sciences Leopoldina 2017). Only in the light of evolutionary knowledge can advances in medical research and the risks involved in biodiversity decline and climate change truly be comprehended. However, numerous studies have provided evidence of the difficulties students (e.g., Fiedler et al. 2018; Göransson et al. 2020; Torkar and Šorgo 2020) and even teachers (e.g., Athanasiou et al. 2016; Tekkaya et al. 2012; Yates and Marek 2013) have in understanding evolution. In the past decades, scientists and educators have explored understanding of evolution across a variety of educational levels and publics, in order to identify possible causal explanations and barriers that make evolution so difficult to understand (Reiss and Harms 2019; Yates and Marek 2014). The generally poor understanding has been attributed to a wide variety of cognitive, epistemological, religious and emotional factors (Alters and Nelson 2002).

Misconceptions about evolution
A fundamental problem in evolution education is that many students hold remarkably high levels of misconceptions about basic evolutionary principles like natural selection, adaptation, speciation or phylogeny (Harms and Reiss 2019a). A misconception is a commonly held idea that is inconsistent with scientific understanding and that is very resistant to instruction. It usually develops in early childhood as part of a very intuitive but naïve understanding of the structure of the world, but it persists into adulthood and is held both by novices and experts (see Gregory 2009 for a review). Misconceptions about evolution include in particular anthropomorphic misconceptions (both internal, i.e., attributing intentional, adaptive change to organisms, and external, i.e., conceiving natural selection as an intentional or conscious agent; Gregory 2009), Lamarckian misconceptions (in the precise meaning of the term: e.g., evolutionary changes can happen due to use and disuse of organs; individuals can pass acquired traits down to their offspring; Kampourakis and Zogza 2007) and "common sense" teleological ideas (e.g., evolution is goal-directed and traits evolve in order to serve specific purposes). However, as many authors have made clear, teleological thinking comprises a wide variety of forms, and not all of them are scientifically unacceptable or pose an obstacle for evolution didactics (González Galli and Meinardi 2011; Hammann and Nehm 2020; Kampourakis et al. 2012a, b; Kampourakis 2020).

Evolution in the European school context
A major cause of these misconceptions could be that evolution, or some major aspects of it such as human evolution, is given little importance in some European countries' school syllabi/curricula (German National Academy of Sciences Leopoldina 2017; Pinxten et al. 2020; Quessada and Clément 2011; Reiss 2018) or is presented in an inappropriate way (Sanders and Makotsa 2016). However, although such reviews often rate European curricula as insufficient and/or inappropriate in terms of evolutionary content, huge differences between countries are visible. For example, in Turkey, which ranks among the lowest countries regarding the acceptance of evolution (Miller et al. 2006), evolution was reasonably taught in schools in the early years of the Republic of Turkey (Peker et al. 2010). However, in 1985 creationism was included in the biology curriculum, overshadowing the teaching of evolution (Peker et al. 2010) until finally, in 2017, evolution was removed from high school textbooks (Genç 2018). In Greece, which is another low ranking country regarding acceptance of evolution (Miller et al. 2006), "the public educational system is very successful in totally exiling evolution education from all its 'territory' without any profound prosecution or any other similar action, for many years" (Athanasiou and Papadopoulou 2015, p. 844). This is done by positioning the chapter on evolution last in biology textbooks, so that teachers usually lack time to teach it (Prinou et al. 2005), or by omitting it from the high school curriculum and the university entrance exams (a situation that has started to change in recent years). Until recently, in Flanders, the Dutch speaking region of Belgium, the teaching of evolution was also largely restricted to the last weeks of the final year of general secondary education, as a separate and last chapter in the textbooks (De Schutter et al. 2005; D'Haeninck et al. 2009).
Kuschmierz et al. Evo Edu Outreach (2020) 13:18
In contrast, in the Netherlands, where public acceptance of evolution is rather high according to Miller et al. (2006), evolution and natural selection are already explicitly addressed in the fourth year of general secondary education and in a more integrated manner throughout the biology curriculum of upper secondary education (Geraedt and Boersma 2006; Smith and Siegel 2004). Likewise, in France, where public acceptance of evolution is high (Miller et al. 2006), evolution is present and central in the science syllabi through all school years "starting by an initiation at the Primary School, a development in Lower Secondary School and a large deepening in the scientific section of High Schools" (Quessada and Clément 2018, p. 213). In England, where public acceptance of evolution is also high (Miller et al. 2006), evolution is embedded in the secondary school curricula but also a contested topic (Reiss 2018). Students in Scotland are taught about evolution from the third year of secondary education on (Downie et al. 2018). The Scottish general science curriculum covers the topics biodiversity and interdependence of living organisms before dealing with natural selection (Downie et al. 2018). In Germany (high acceptance of evolution; Miller et al. 2006), Switzerland (moderate ranking country regarding acceptance of evolution; Miller et al. 2006), Austria (rather low ranking compared to other European countries regarding acceptance of evolution; Miller et al. 2006), and Luxembourg (rather high acceptance of evolution; Miller et al. 2006), primary education does not address evolution (Eder et al. 2018). The situation at the secondary level is complex due to many different curricula in the German federal states and the cantons of Switzerland and different school types for lower and higher secondary education. However, evolution is taught in all four countries once in lower and upper secondary education each. Therefore, Eder et al. (2018) stated that students who leave school in those four countries after higher secondary education should have at least basic knowledge about evolution.
Curriculum and textbook analyses are hard to accomplish but could reveal gaps in evolution education. A comprehensive analysis and assessment of European curricula based on a standardized framework (Understanding Evolution 2020) is currently in preparation (EuroScitizen COST Action CA17127). The "BIOHEAD-Citizen project" (Biology, health and environmental education for better citizenship) was one of the first attempts to analyze countries' curricula and included 13 European and six non-European countries (Carvalho et al. 2007). Although the project did not examine the coverage of evolution in general but only of "human evolution" in school curricula and textbooks, it provided some very useful results, such as that "the social context strongly influences the way evolution is (or is not) taught, particularly human evolution" (Carvalho et al. 2007, p. 305).

Evolution education research in Europe
To date, the majority of evolution education research has been carried out in the USA, which may be mainly explained by the particular public resistance to evolution there, as regularly published polls demonstrate (Brenan 2019). However, empirical evidence shows that population polls (e.g., Brenan 2019) more likely measure differences in religious faith than in acceptance of evolution (Beniermann 2019; McCain and Kampourakis 2018).
The situation in Europe is much more diverse, as the more fragmented education research communities, different educational systems and languages make it challenging to gather comparable data for different European countries. Comparable data sets of European countries are very rare (but see e.g., Clément 2015a; Pinxten et al. 2020;Šorgo et al. 2014). Furthermore, a diverse science education research community may more often use national measuring instruments. In contrast to the USA, instruments usually have to be translated in order to conduct cross-country comparisons in Europe, which is a possible source of data bias. On the other hand, European countries differ significantly concerning public acceptance of evolution, national anti-evolution movements, evolutionary concepts in school curricula and biology teacher education programs, teachers' attitudes towards teaching evolution (Deniz and Borgerding 2018), teachers' acceptance of evolution (Clément 2015a) as well as the available study results about students' knowledge about different evolutionary concepts (Harms and Reiss 2019b). As a result, the various cultural backgrounds as well as different school systems within Europe can serve as a foundation for interesting research questions and hypotheses.

Relationships between knowledge, acceptance and religious faith
At present, relatively little information is available with respect to the level of acceptance and understanding in Europe, where religious beliefs generally are assumed to interfere less with attitudes towards evolution (Miller et al. 2006). But even in European samples, the relationship between attitudes towards evolution and religious faith was shown to be generally negative and mostly strong (e.g., Beniermann 2019; Graf and Soran 2010). However, religious diversity has increased within the last decades, especially in Europe (differentiation within religions, migration, rising interest in alternative new age spirituality; Pollack et al. 2012; Stolz et al. 2014).
The relationship between attitudes towards evolution and knowledge about evolution, in particular, is another central issue for science education research (Dunk et al. 2019). To date there is no clear consensus in the evolution education community about the nature and the extent of this relationship (e.g., Barnes et al. 2019; Dunk et al. 2019; Glaze and Goldston 2015). The application of different measuring instruments (e.g., Barnes et al. 2019; Mead et al. 2019) as well as the different use of terms concerning the key constructs (Konnemann et al. 2012; Smith and Siegel 2016) may be the main reasons for inconsistent results in this research area. This is a crucial issue for science education, since studies on attitudes and knowledge about evolution as well as their relationship lead to conclusions regarding the teaching of evolution (e.g., for Turkey, Annaç and Bahçekapili 2012).

Measuring issues
However, to be able to investigate this relationship and to compare surveys with diverging results, the utilized measuring instruments should measure equivalent constructs. Besides this aspect of content validity, comparative investigations require appropriate evidence for validation in the local context of the single studies (AERA 2014). Since Nehm and Schonfeld (2008) raised the issue of measuring knowledge about natural selection and the subsequent debate (Anderson et al. 2010; Nehm and Schonfeld 2010), the discourse concerning measurement issues in evolution education has accelerated and has been addressed continuously in recent years (e.g., Anderson et al. 2010; Barnes et al. 2019; Beniermann 2019; McCain and Kampourakis 2018; Mead et al. 2019; Novick and Catley 2012). In the introduction to a special issue devoted to the topic of evolution assessment, Nehm and Mead (2019) have recently underlined the importance of drawing greater attention to research on the measurement and assessment of knowledge, attitudes and skills that are central to evolution education, thus calling for further research efforts in this area. In fact, multiple challenges arise in this context. First, the partly missing definitions of key constructs like attitudes, acceptance, knowledge and understanding lead to different operationalizations (Konnemann et al. 2012; McCain and Kampourakis 2018). In the following, we will use the term knowledge instead of the often-used term understanding when referring to measuring instruments that focus on content knowledge. This is in accordance with Smith and Siegel (2016), who pointed out that a "student gains knowledge (via instruction, self-study, etc.) upon which she can build understanding" (Smith and Siegel 2016, p. 486). The term acceptance, hereafter, describes a positive attitude towards evolution, while a negative attitude is called rejection. Second, Barnes et al. (2019) showed how different evolution acceptance instruments can sometimes lead to diverging results regarding the level of acceptance when applied to the same population. This indicates a potential bias in research results and the related conclusions in evolution education studies using different instruments to assess acceptance of evolution. Third, it was shown that acceptance is higher for microevolution than for human evolution, as well as for evolution in general than for evolution of the human mind (Beniermann 2019). Hence, the differences between these have to be considered when measuring evolution acceptance (Rughinis 2011; Kampourakis and Strasser 2015). Fourth, knowledge about evolution may be seen as a multidimensional construct and therefore results depend on the evolutionary concept that is assessed (Kuschmierz et al. 2020). In addition, given the unique and complex nature of context in evolutionary thinking and reasoning, evolution assessment tasks intended to measure knowledge and/or alternative conceptions may be characterized by heightened sensitivity to context effects. Nehm and Ha (2011) indeed showed that the specific scenarios/contexts in which students are asked to reason evoke different types, magnitudes, and arrangements of key concepts of natural selection and alternative conceptions. However, the vast majority of evolution education studies have failed to carefully consider or control for context effects of items in assessment tasks (Son and Goldstone 2009, but see Nehm et al. 2012). Fifth, Mead et al. (2019) pointed out the importance of measurement standards for instruments measuring evolutionary knowledge and acceptance. They reviewed 13 different evolution education assessment instruments with respect to the evidence supporting their validity and reliability. Mead et al. (2019) revealed validity and reliability issues for some often-used instruments. Additionally, most instruments were validated for only one specific population.
These findings indicate that it is difficult to compare the results gathered with different instruments. Another crucial point is that many studies only used parts of published instruments or modified versions, which may affect how well an instrument measures the intended construct.
Group comparisons such as between students of different grades, people from different countries or regarding the effect of different instructions are only reasonable, if comparable data is available for all groups. It is therefore important to use instruments for which there is supporting evidence to measure the same construct or ideally even the same instrument and similar target groups. Much research has been conducted in the USA with numerous instruments and target groups (Dunk et al. 2019). However, even on this database, questions about the relationship between acceptance and knowledge remain.

Objective
In recently published papers, authors emphasize the crucial importance of ongoing work to investigate the relationship between evolution acceptance and knowledge (e.g., Barnes et al. 2019; Dunk et al. 2019; Mead et al. 2019), since the assessment of these variables is a crucial issue for science education research (Dunk et al. 2019). The aim of the present article is to contribute to this ongoing challenge by providing an overview of the current state of research regarding evolutionary knowledge and attitudes of students and teachers across Europe, as these groups are of particular relevance in the context of science education. In contrast to the existing global overviews (e.g., Deniz and Borgerding 2018; Harms and Reiss 2019b), the present work aims at filling the gap in the European context, which has not been covered by any overview so far. Thus, we focus exclusively on European studies and the comparability of their research findings, based on an analysis of the measuring instruments used, the target groups surveyed within the field of education, and the evidence provided for local validity and reliability. The study results on evolutionary acceptance and/or knowledge about evolution conducted in European countries as well as the instruments used and evidence for local validity and reliability are presented on the basis of a systematic literature review. Comparisons across different European countries, target groups and instruments are evaluated. However, with the methodological shortcomings in mind, validity issues are subsequently discussed based on the literature review on evidence for local validity.

Process of literature review
To investigate how frequently commonly used instruments for measuring evolutionary knowledge and acceptance are applied across Europe, a citation search in Google Scholar was performed from February to March 2020. The citation search was conducted for all 13 instruments identified by Mead et al. (2019) as the most commonly used (see Additional file 1). Starting with the original publications of the instruments, all papers that were listed as "cited by" and written in English or in one of the authors' spoken languages (Croatian, Dutch, German, Greek, Italian, Macedonian, Serbian, and Slovenian) have been reviewed. Focusing on the current state of research, only results of the period 2010-2020 have been examined.
For each paper, we extracted the surveyed European sample, the research instrument used, all relevant results regarding knowledge and acceptance of evolution, the correlation between these variables, and correlations between acceptance and religiosity as a possible predictor. Additionally, we reported sources of evidence for validity and reliability that were provided in the identified papers. In doing so, we focused on the presentation of established measures of reliability (e.g., Cronbach's α) and internal structure as a measure of validity (e.g., Principal Component Analysis [PCA]) as well as other sources of reliability or validity in cases where the respective authors directly refer to the concepts of reliability and/or validity (e.g., expert review for content validity). We also considered whether the original instrument, a modified version or even only single questions were used and whether the original instrument was translated before implementation or not. In the case of pre-post intervention studies, we only took pre-test results into account. We did not include studies that focused on qualitative research (e.g., interviews). Moreover, we excluded studies in which evolutionary knowledge and/or acceptance served only as control or predictor variables without results being presented in detail (e.g., mean score). A total of N = 27 papers were identified using five of the 13 commonly used instruments (ACORNS, CINS, I-SEA, KEE, MATE, see Additional file 1).
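As a brief illustration of the reliability measure named above, Cronbach's α for k items is commonly computed from the item variances and the variance of the sum scores, α = k/(k − 1) · (1 − Σ s_i² / s_total²). The following minimal Python sketch assumes complete data (no missing responses) and one row of item scores per respondent; the function name and the toy data are hypothetical and are not taken from any of the reviewed studies.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Assumes every respondent answered every item (no missing values).
    """
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in responses]) for i in range(k)]
    # Sample variance of the respondents' sum scores.
    total_var = variance([sum(row) for row in responses])
    if total_var == 0:
        raise ValueError("sum scores have zero variance")
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical toy data: five respondents, four dichotomously scored items.
toy = [
    [1, 1, 0, 1],
    [0, 0, 0, 1],
    [1, 1, 1, 1],
    [0, 1, 0, 0],
    [1, 1, 1, 0],
]
print(round(cronbach_alpha(toy), 3))
```

Because α depends on the sample at hand, a value reported for the original validation sample does not automatically carry over to a translated or shortened version used with a different population.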
To additionally cover all results concerning knowledge and acceptance of evolution by students and teachers in Europe gathered with other instruments, we performed a supplementary keyword search in Google Scholar, similar to the keyword search Mead et al. (2019) conducted, in April and May 2020. This search was conducted with the key words "student understanding of evolution", "student knowledge of evolution", and "student acceptance of evolution", as well as "teacher understanding of evolution", "teacher knowledge of evolution", and "teacher acceptance of evolution" in Croatian, Dutch, English, German, Greek, Italian, Macedonian, Serbian, and Slovenian. A total of N = 26 additional papers, using 31 measuring instruments to assess attitudes and knowledge about evolution, different from those discussed in Mead et al. (2019), were identified. Three of these 31 other instruments were also used in multiple papers: the "Evolution Content Knowledge Test" (ECKT; Johnson 1985; modified by Rutledge and Warden 2000), the "Open Response Instrument" (ORI; Nehm and Reilly 2007) and the "Knowledge About Evolution" instrument (KAEVO; Beniermann 2019; Kuschmierz et al. 2020). They were therefore added to the list of widespread instruments (see Additional file 1).

Score categories
The use of categories referring to levels of knowledge about evolution or acceptance of evolution makes it possible to interpret and, if applicable, compare sum scores of similar sample groups gathered with different instruments (e.g., in different countries). Rutledge and Sadler (2007) defined categories of levels of acceptance for the MATE, making it easier to compare different data sets. Kuschmierz et al. (2020) also defined categories for the KAEVO 2.0. Since no categorization for the other widespread instruments was found, we recommend categories for those instruments that have been used in Europe since 2010. Based on the MATE and KAEVO categories, we calculated five categories for each instrument (see Tables 1 and 2). We do not suggest categories for the ATEEK, CANS, ECT, EvoDevoCI, GeDI, MUM, EALS, and GAENE, as these instruments have not been used in Europe so far.
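To make the idea of score categories concrete, the following Python sketch maps a sum score to one of five labelled bands. It is only an illustration under stated assumptions: the helper name and data layout are hypothetical, and the MATE cut-offs shown are the bands as they are commonly cited from Rutledge and Sadler (2007); they should be checked against the original publication and Table 2 before any use.

```python
# Lower bound of each band, highest band first (MATE sum scores range from 20 to 100).
# Assumed cut-offs, as commonly cited from Rutledge and Sadler (2007);
# verify against the original publication and Table 2.
MATE_BANDS = [
    (89, "very high"),
    (77, "high"),
    (65, "moderate"),
    (53, "low"),
    (20, "very low"),
]


def categorize(score, bands, max_score):
    """Return the label of the band that contains `score`.

    `bands` lists (lower_bound, label) pairs in descending order of lower bound;
    the last entry's lower bound is the minimum possible score.
    """
    min_score = bands[-1][0]
    if not min_score <= score <= max_score:
        raise ValueError(f"score must lie between {min_score} and {max_score}")
    for lower_bound, label in bands:
        if score >= lower_bound:
            return label


print(categorize(70.5, MATE_BANDS, max_score=100))
```

Keeping the cut-offs in a plain table separate from the lookup logic means the same helper could be reused for any other instrument once its bands are known.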
With respect to the evolutionary knowledge instruments, these newly created categories for the CINS are in accordance with the suggestion of Anderson et al. (2010) that " […] anyone who scores 16/20 or higher on CINS understands natural selection quite well", since 16 in our scale is the mid score of the category "rather high". Additionally, these categories are in line with the suggestions of several authors who used the CINS in European countries (e.g., Annaç and Bahçekapili 2012; Buchan 2019; see Additional file 2).
For the ECKT, Rutledge and Warden (2000) reported a moderate level of knowledge about evolution, corresponding to a mean of 14.89. This is in line with our newly created categories (Table 1). Moreover, the newly created categories are consistent with suggestions of the authors who used the ECKT in Europe (e.g., Deniz and Sahin 2016;Stanisavljevic et al. 2013; see Additional file 2). Furthermore, the categories for the KAEVO 1. For the I-SEA, we did not find any suggestions for categories in the original publication, which is why we suggested categories based on the MATE. If our newly developed categories for acceptance and knowledge scores which we applied in the results section differed from the initial interpretation in the original publications (see Additional file 2 for initial interpretations), we mentioned this in a footnote.
As the ORI and the ACORNS are open-response instruments, we did not suggest assessment categories for these two instruments. However, Nehm and Reilly (2007) suggested the "Natural Selection Performance Quotient" (NSPQ) to quantify student knowledge and misconceptions.
Results on evolutionary knowledge and evolution acceptance are presented separately. Subsequently, all results on the relationship between evolutionary knowledge and acceptance are presented and compared only if a similar target group was surveyed and the same instrument was used in both studies. We did this aware of the shortcomings deriving from huge differences in terms of the provided evidence for reliability and local validity.

Education levels in Europe
Even though the type of education granting admission to the profession of teachers differs considerably between the European countries (Evagorou et al. 2015), we use the term "pre-service teachers" for all students who will become school teachers and are accordingly enrolled in a teacher education program in their respective country. However, all studies focusing on pre-service teachers in the current review refer to undergraduates.
The definition of school levels and the respective grades are also very diverse both between and even within European countries. We decided to define different school levels following the "International Standard Classification of Education (ISCED)" (European Commission 2019, see Table 3). All school levels mentioned in this paper refer to Table 3.
In 14 papers, no evidence for local validity and/or reliability was provided. Eleven of the 35 papers on acceptance of evolution provided evidence for local validity and reliability, four only for local validity and twelve only for reliability. In ten papers, no evidence for either local validity or reliability was provided. Based on this literature review, additional information on the identified studies, such as the instrument(s) used, the sample group(s) and the origin(s) of the sample(s), is presented in Additional files 3, 4, and 5.

Knowledge about evolution

CINS
Nine surveys in six European countries (Belgium (Flanders region), Germany, Greece, the Netherlands, Turkey, the United Kingdom) used the multiple-choice instrument CINS, designed to measure the knowledge about the following 10 underlying key concepts of natural selection: origin of variation, existence of variation (in a population), variation is inherited, differential survival, limited survival, biotic potential, limited (natural) resources, change in a population, population stability and origin of species, with two items for each concept (score: 0-20; Anderson et al. 2002). Based on the results of seven of Comparisons between university student groups revealed that knowledge about evolution increased significantly with the biology education level in biology majors (M 1st -3rd year = 11.6, M postgraduate in biology education = 14.2, and M 4th year = 15.1), compared to biology non-majors with and without biology classes (M no biology = 2.9 and M biology = 9.6) in Greece (Athanasiou and Mavrikaki 2014), biology majors compared to early childhood education and primary education pre-service teachers (M biology majors (all years) = 13.4, M early childhood = 9.7; Lazaridis et al. 2011) and different groups of pre-service teachers in Germany (M primary = 11.75, and M upper secondary = 14.74; Großschedl et al. 2018). Pinxten et al. (2020) compared CINS scores between Veterinary Sciences and Biomedical Sciences university freshmen in Belgium, having completed high level biology secondary education either in Flanders, Belgium or the Netherlands and reported that Dutch students obtained a significantly higher score (M Dutch = 14.4, M Flanders = 12.5).
Two studies examined in-service teachers in two European countries (Greece and the United Kingdom). In Greece, secondary education in-service science teachers showed a moderate level of knowledge about natural selection, and the biology teachers among them a rather high level (M total = 14.33, M biologists = 16.60; Venetis 2017). In the United Kingdom, primary and lower secondary education in-service teachers also showed a moderate level of knowledge about natural selection (M = 12.84; Buchan 2019).
A variety of misconceptions has been observed for university students in Greece (Athanasiou and Mavrikaki 2014), Flanders, Belgium (Pinxten et al. 2020), and Turkey (Tekkaya et al. 2011). Novice university students held more teleological misconceptions than advanced university students in Greece (Athanasiou and Mavrikaki 2014). Teleological misconceptions have also been found in primary and lower secondary education in-service teachers in the United Kingdom (Buchan 2019) who moreover also showed anthropomorphic and Lamarckian (soft inheritance) misconceptions. Pinxten et al. (2020) reported that the relative frequency of misconceptions elicited by the CINS was almost identical in Flemish and Dutch Veterinary and Biomedical Sciences university freshmen, with 'intention/need related to speciation' being the most common misconception in both samples.
The concept of biotic potential was very difficult to understand for university students in Greece (Athanasiou and Mavrikaki 2014; Lazaridis et al. 2011) and senior pre-service science teachers in Turkey (Tekkaya et al. 2011). By contrast, Flemish (68.5%) and Dutch (74.4%) Veterinary and Biomedical Sciences university freshmen appeared to have a good understanding of this concept (Pinxten et al. 2020). Change in a population and origin of species were often misunderstood among novice university students in Greece (Athanasiou and Mavrikaki 2014), Veterinary Sciences and Biomedical Sciences university freshmen in Belgium (Pinxten et al. 2020), senior pre-service science teachers in Turkey (Tekkaya et al. 2011), and primary and lower secondary education in-service teachers in the United Kingdom (Buchan 2019). According to Lazaridis et al. (2011), many biology majors who scored high on the CINS were nevertheless not consistent in their answers to the two items addressing the same concept.
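The CINS scoring described above (ten key concepts, two items per concept, sum score 0-20) and the observation that respondents may answer the two items of one concept inconsistently can be sketched in Python as follows. This is an illustration only: the function name is hypothetical, and the item-to-concept layout used here (consecutive item pairs sharing a concept) is an assumption for the example, not the actual item order of the CINS, which is given in Anderson et al. (2002).

```python
CONCEPTS = [
    "origin of variation", "existence of variation", "variation is inherited",
    "differential survival", "limited survival", "biotic potential",
    "limited resources", "change in a population", "population stability",
    "origin of species",
]

# Hypothetical layout: items 1-2 belong to the first concept, items 3-4 to the
# second, and so on. The real CINS item order differs (see Anderson et al. 2002).
HYPOTHETICAL_KEY = [concept for concept in CONCEPTS for _ in range(2)]


def score_cins(correct, item_concepts):
    """Score one respondent.

    correct: 20 booleans (True = item answered correctly).
    item_concepts: 20 concept names, one per item, each concept appearing twice.
    Returns (sum score 0-20, per-concept scores 0-2, concepts answered inconsistently).
    """
    if len(correct) != 20 or len(item_concepts) != 20:
        raise ValueError("the CINS has 20 items")
    if any(item_concepts.count(concept) != 2 for concept in CONCEPTS):
        raise ValueError("each concept must be covered by exactly two items")
    per_concept = {concept: 0 for concept in CONCEPTS}
    for is_correct, concept in zip(correct, item_concepts):
        per_concept[concept] += int(is_correct)
    total = sum(per_concept.values())
    # A concept is answered inconsistently if exactly one of its two items is correct.
    inconsistent = [c for c, points in per_concept.items() if points == 1]
    return total, per_concept, inconsistent


answers = [True, False] + [True] * 18
total, per_concept, inconsistent = score_cins(answers, HYPOTHETICAL_KEY)
print(total, inconsistent)
```

Reporting per-concept scores alongside the sum score makes it visible when a high total masks inconsistent answers within a single concept.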

ECKT
Seven studies (four on pre-, three on in-service teachers) in three European countries (Greece, Serbia, Turkey) used the ECKT (score: 0-2; Rutledge and Warden 2000) or modified versions of this multiple-choice instrument that covers the evolutionary concepts of natural selection, extinction processes, homologous structures, coevolution, analogous structures, convergent evolution, intermediate forms, adaptive radiation, speciation, evolutionary rates, the fossil record, biogeography, environmental change, genetic variability, and reproductive success. Studies revealed that the level of knowledge about evolution among science and biology pre-service teachers is very low in both Greece (M biology = 7.63;  and Turkey (M science = 7.99; Akyol et al. 2010; M science = 8.00; Akyol et al. 2012; M biology = 8.62; Deniz and Sahin 2016). In Turkey, a modified version of the ECKT was used. Akyol et al. (2010) used the modified version by Deniz et al. (2008), while Deniz and Sahin (2016) did not specify the modifications.
In-service teachers' evolution knowledge based on the ECKT ranges from very low (early childhood education) to low (secondary education biology) in Serbia (M early childhood = 7.14, M secondary (biology ) = 11.
Early childhood and primary education pre-service teachers showed a very low level of evolutionary knowledge in Slovenia (M = 3.02⁶; score: 0-12; Torkar and Šorgo 2020). Students of different university programs showed low knowledge about evolution (M = 5.27; score: 0-9; Beniermann 2019), while biology and non-biology students showed very low knowledge about evolution (M biology = 4.85, M non-biology = 4.31; score: 0-12; Kuschmierz et al. 2020). Although biology and non-biology students reached similar levels of knowledge about evolution, the two groups differed significantly from each other (Kuschmierz et al. 2020). Torkar and Šorgo (2020) used eight of the twelve items of the KAEVO 2.0, Beniermann (2019) all nine items of the KAEVO 1.0.
Beniermann (2019) additionally surveyed German biology teachers in practical training after graduation (in the following added to the group of in-service teachers), who showed moderate knowledge about evolution (M = 6.92; Beniermann 2019). In both German studies with the KAEVO, knowledge about evolution was compared between different educational groups and increased with age and educational level (Beniermann 2019; Kuschmierz et al. 2020).
Teleological thinking was the most frequently found misconception in the adaptation items for both the Slovenian sample and German samples. Additionally, in all samples biological fitness was difficult to understand, while a majority of all samples answered an item on heredity of phenotype changes to the direct offspring correctly. In-service biology teachers showed predominantly teleological and anthropomorphic misconceptions, while Lamarckian misconceptions were rather prominent among school students (Beniermann 2019).

KEE
One study on third-year university students in Spain used the multiple-choice instrument KEE (score: 0-10; Moore and Cotner 2009) that covers the following evolutionary concepts: natural selection, biological fitness, evolutionary change, variation. The study revealed low knowledge about evolution for chemistry, history, and English philology students (M chemistry = 5.2, M history = 4.8, M english philology = 4.4; Gefaell et al. 2020) and moderate knowledge for biology students (M = 6.5; Gefaell et al. 2020).

ORI
Three surveys in two European countries (Germany and Sweden) implemented the ORI (Nehm and Reilly 2007), an open response format instrument on natural selection. Results revealed that university students in Germany and Sweden rarely and inconsistently used randomness and probability (Göransson et al. 2020; Harms and Fiedler 2019) as well as time aspects (Göransson et al. 2020) to explain evolutionary processes. Also, students made only moderate use of evolutionary key concepts to explain evolutionary changes (Harms and Fiedler 2019). A comparison between biology majors and pre-service biology teachers found remarkable deficits among pre-service biology teachers, both in using randomness and probability in evolutionary contexts and in evolutionary knowledge in general (Fiedler et al. 2017). Examples of evolutionary adaptation that include the loss of traits seemed to be more challenging for students than examples that include the gain of traits (Göransson et al. 2020).

ACORNS
Two studies (Großschedl et al. 2018; Nehm et al. 2013) in Germany used another open response instrument, the ACORNS, on natural selection and non-adaptive change. Großschedl et al. (2018) showed that German secondary education pre-service teachers used evolutionary key concepts significantly more often, and scientifically inaccurate concepts significantly less often, than primary education pre-service teachers when explaining scenarios in an evolutionary context. According to the authors, the gain of traits was easier to explain in animals than in plants, while the loss of traits was easier to explain in plants than in animals (Großschedl et al. 2018). Nehm et al. (2013) compared pre-service biology teachers in Germany, the USA, Korea, and Indonesia and found that evolutionary reasoning was similar across the different cultural contexts. Evolution in animals was significantly easier to explain than in plants. In agreement with the previously presented results from Göransson et al. (2020), examples of evolutionary adaptation that include the gain of traits seemed to be easier for students than examples that include the loss of traits (Nehm et al. 2013).

Other instruments
In addition to these repeatedly used instruments, 16 studies with 16 other instruments on knowledge about evolution have been conducted since 2010 in nine European countries (Croatia, Czech Republic, Germany, Greece, Italy, Slovakia, Slovenia, Switzerland, Turkey). Details of the respective instruments and the main results of these studies are summarized in Additional file 4.

Acceptance of evolution

I-SEA
The I-SEA (score: 24-120; Nadelson and Southerland 2012), a 24-item 5-point rating scale, includes three subscales on microevolution, macroevolution, and human evolution. The I-SEA was used only once in Europe (the United Kingdom; Betti et al. 2020). Based on the results, most first-year life sciences undergraduate students in the United Kingdom showed high acceptance of evolution, with lower acceptance of human evolution and macroevolution than of microevolution (M total = 93.12, M microevolution = 96.48, M macroevolution = 92.88, M human evolution = 92.40; score: 24-120; Betti et al. 2020). In this sample, religiosity was significantly negatively correlated with evolution acceptance, with the lowest acceptance scores for Muslim students, followed by Christians and students of other religions, and the highest scores for students with no religion. Biomedical and health students showed significantly lower evolution acceptance than general biology, anthropology or zoology students (Betti et al. 2020).

MATE
The 5-point rating scale MATE includes 20 items on the processes of evolution, the available evidence of evolutionary change, the ability of evolutionary theory to explain phenomena, the evolution of humans, the age of the Earth, the independent validity of science as a way of knowing, and the current status of evolutionary theory within the scientific community (score: 20-100; Rutledge and Warden 1999; Rutledge and Sadler 2007). The MATE was used in 20 studies and six European countries (Germany, Greece, Serbia, Spain, Turkey, the United Kingdom), making it the most often used instrument for measuring evolution acceptance in Europe since 2010.  Lammert 2012) and high acceptance of evolution in the United Kingdom (no mean value, Mead et al. 2018). Strong believers showed low evolution acceptance in Germany, and the influence of the denomination on acceptance was significant, with the lowest acceptance scores for Muslims and the highest scores for students without a denomination (Lammert 2012). Konnemann et al. (2016) reported that in Germany especially Christian Free Churchers (70.6%), but also Muslims (30.2%), showed low acceptance of evolution, positive attitudes toward the Biblical accounts of creation and a high degree of creationist belief, while unaffiliated respondents showed the highest acceptance of evolution.
Pre-service biology teachers showed low⁸ (M = 61.06⁹; Deniz and Sahin 2016; M biology = 59.81¹⁰; Irez and Bakanay 2011) to moderate acceptance (M = 65.52; Deniz  Nehm et al. 2013). A significant negative correlation for evolution acceptance and religiosity was reported for pre-service biology teachers in Greece  and Turkey (Deniz et al. 2011; Deniz and Sahin 2016).
⁷ Sum score was calculated for the current review based on the presented data in Lammert (2012).
⁸ Different interpretation of Deniz and Sahin (2016): moderate acceptance.
⁹ Sum score was calculated for the current review based on the presented data in Deniz and Sahin (2016).
¹⁰ Sum score was calculated for the current review based on the presented data in Irez and Bakanay (2011).
Pre-service teachers of different fields showed low acceptance of evolution¹¹ in Turkey (M = 57.40¹²; Bilen and Ercan 2016). Moderate acceptance of evolution was reported for pre-service science teachers in Turkey (M = 66.40¹³; Akyol et al. 2010). Low acceptance, with even lower acceptance for pre-service science teachers who had previously attended a course on science and nature of science than for students who had not, was reported in Turkey (M attended = 55.38, M not attended = 61.20; Yüce and Önel 2015¹⁴). University students in Germany showed high acceptance of evolution for both a treatment and a control group in an interventional study (M treatment = 81.20, M control = 87.00; Konnemann et al. 2018). In line with this, Spanish third-year university students from different degree programs also showed high acceptance of evolution (M = 87.20; Gefaell et al. 2020).
In-service teachers' evolution acceptance reached from moderate for primary and secondary education teachers A significant negative correlation for evolution acceptance and religiosity was reported for in-service teachers teaching biology in Greece .

Other instruments
In addition to these repeatedly used instruments, 15 studies with 15 other instruments on acceptance of evolution have been conducted since 2010 in 21 European countries (Austria, Cyprus, Denmark, Estonia, Finland, France, Georgia, Germany, Hungary, Italy, Lithuania, Malta, Poland, Portugal, Romania, Russia, Serbia, Spain, Sweden, Turkey, the United Kingdom). Details of the respective instruments and the main results of these studies are summarized in Additional file 5.

Correlation between knowledge and acceptance of evolution
We identified 17 studies that reported the relationship between knowledge and acceptance of evolution in six European countries (Germany, Greece, Serbia, Spain, Turkey, the United Kingdom). German secondary education students showed a weak positive correlation between knowledge and acceptance of evolution (grade 9-11, Beniermann 2019; grade 9-10, Lammert 2012). Beniermann (2019) also found a weak positive correlation between knowledge and acceptance of evolution for university students. Likewise, pre-service teachers showed a weak positive correlation between knowledge and acceptance of evolution in Germany (Graf and Soran 2010; Großschedl et al. 2014), Turkey , and Greece . Additional studies in Germany (Großschedl et al. 2018; Nehm et al. 2013) and Turkey (Deniz and Sahin 2016) revealed a moderate positive correlation for pre-service teachers. Also, a moderate positive correlation for in-service teachers was found in Germany (Beniermann 2019¹⁶), Serbia (Stanisavljevic et al. 2013) and the United Kingdom (Buchan 2019).
By contrast, some studies did not find significant correlations between knowledge and acceptance of evolution. This was the case for primary and lower secondary education students in Germany (grade 7, Beniermann 2019; grade 5-6, Fenner 2013), psychology students (Annaç and Bahçekapili 2012) and pre-service teachers in Turkey (Akyol et al. 2010; Graf and Soran 2010), third-year university students of different fields in Spain (Gefaell et al. 2020) and in-service teachers in Greece  and Turkey .
¹¹ Different interpretation of Bilen and Ercan (2016): undecided position about evolution.
¹² Sum score was calculated for the current review based on the presented data in Bilen and Ercan (2016).
¹³ Sum score was calculated for the current review based on the presented data in Akyol et al. (2010).
¹⁴ Sum scores were calculated for the current review based on the presented data in Yüce and Önel (2015).
¹⁵ Sum score was calculated for the current review based on the presented data in Tekkaya et al. (2012).
¹⁶ Trainee biology teachers.

Discussion
The diversity of the instruments used to assess acceptance of evolution and knowledge about evolution in Europe makes comparisons within and between countries and educational groups complicated, and their validity questionable. Another crucial point in this regard is the frequent lack of evidence for local validity and reliability that was revealed by the present review (see Additional files 3, 4, and 5). Moreover, only five of the 13 most commonly used instruments were found to have been applied to European samples (ACORNS, CINS, I-SEA, KEE, MATE). This may be partly explained by the fact that some instruments have been developed and published only recently (as is the case for CANS and GAENE). This, along with a generally low number of studies per country across Europe (for both knowledge and acceptance of evolution, see Figs. 1 and 2), indicates that much more research is still needed i) to expand and diversify samples, ii) to unify already available ones and compare among them, and iii) to apply standards to provide appropriate sources of evidence for reliability and validity. This way it will be possible to get a clearer picture of the European educational context and to make sound and reliable inferences on how different instructional settings impact learning.
With these methodological limitations in mind (see the paragraph on validity issues for a more detailed discussion), our results show that the current state of research regarding knowledge and acceptance of evolution of students and teachers in Europe is diverse. However, some major points of concern emerge from our results. As we detail below, pre-service teachers show low to moderate levels of knowledge about evolution in some samples of several European countries (Turkey, Germany, Greece, Slovenia, Czech Republic, Slovakia). In some surveyed samples (Greece and Turkey), undecided attitudes or even rejection of evolution are recorded. As regards knowledge about evolution of primary education in-service teachers, scores are unsatisfactory, ranging from very low to moderate. Teachers, and in particular biology teachers, play a key role in correcting misleading notions and conceptual schemas of evolution from the early stages of education, adjusting instruction to respond to their students' inquiries and needs. The persistence of various misconceptions through all educational stages that we found in our study should be examined by future research in light of these critical aspects, along with a more detailed understanding of the educational offer about evolution across various curricula.

Knowledge about evolution

School students
The level of knowledge about evolution in European school students has not yet been explored much. The present review identified ten publications in six European countries on the assessment of early childhood, primary and secondary education students' knowledge about evolution (Croatia and Slovenia: Kralj et al. 2018; Germany: Beniermann 2019; Fenner 2013; Jördens et al. 2016; Kuschmierz et al. 2020; Lammert 2012; Greece: Kampourakis et al. 2012a, b; Italy: Kampourakis et al. 2012a, b; Switzerland: Queloz et al. 2017), gathered with eight different instruments (KAEVO and self-developed). In summary, the data on knowledge about evolution in European school students are limited and not unified. The current state of research reveals mixed levels of knowledge about evolution for secondary education students, from very low (Beniermann 2019; Kuschmierz et al. 2020) and moderate (Fenner 2013; Lammert 2012) to high (Rufo et al. 2013). Furthermore, a variety of misconceptions, predominantly teleological and Lamarckian, is apparent for primary (e.g., Kampourakis et al. 2012a, b) and secondary education students of various grades (e.g., Beniermann 2019; Fenner 2013; Fischer 2014; Jördens et al. 2016; Lammert 2012; Queloz et al. 2017). The persistence of such misconceptions might indicate that European school curricula do not fully succeed in coping with naïve conceptual frameworks (that are known to develop at an early age). The knowledge displayed by pre-service and in-service teachers also plays a significant role in this regard, and critical aspects have emerged here (see sections below).

University students
The level of knowledge about evolution of European university students varies between and within the different fields of study. Knowledge about evolution was very low (Germany: English language and literature, and mathematics students, Kuschmierz et al. 2020; Turkey: psychology majors, Annaç and Bahçekapili 2012) and low (Germany: different study programs, Beniermann 2019; Spain: chemistry, history, and English philology students, Gefaell et al. 2020) in university students from different non-biology related study programs. Biology-related university freshmen showed low knowledge about evolution (Belgium: Pinxten et al. 2020). Biology majors showed very low (Germany: Kuschmierz et al. 2020), moderate (first- to third-year and postgraduate biology majors, Greece: Athanasiou and Mavrikaki 2014; Spain: Gefaell et al. 2020) to rather high knowledge about evolution (fourth-year biology majors, Greece: Athanasiou and Mavrikaki 2014; Lazaridis et al. 2011). The finding of Nehm and Ha (2011) for university students in the USA, that examples of evolutionary adaptation including the loss of traits are more challenging than examples that include the gain of traits, was also confirmed for German university students (Göransson et al. 2020).
Misconceptions, predominantly teleological misconceptions, were also present among university students of different fields of study (Athanasiou and Mavrikaki 2014; Beniermann 2019; Kuschmierz et al. 2020; Pinxten et al. 2020). Also, some evolutionary concepts, for example 'biotic potential' (Athanasiou and Mavrikaki 2014; Lazaridis et al. 2011), change in a population, and origin of species (Athanasiou and Mavrikaki 2014; Pinxten et al. 2020), seemed to be difficult to understand across multiple samples. In summary, knowledge about evolution increased with biology education level across different European university student samples.

Pre-service teachers
Pre-service teachers, especially future biology teachers, play a special role in terms of knowledge about evolution and are the most assessed group of university students in this regard, with numerous studies in Turkey (Akyol et al. 2010; Deniz and Sahin 2016; Graf and Soran 2010; Keskin and Köse 2015; Tekkaya et al. 2011) and Germany (Fiedler et al. 2017; Graf and Soran 2010; Großschedl et al. 2018; Nehm et al. 2013). Pre-service teachers showed low to moderate knowledge in Turkey and low to rather high knowledge in Germany. In other countries, the evidence base is very thin or no publications were found. Overall, 15 studies on pre-service teachers were identified in six countries (Czech Republic: 1; Germany: 6; Greece: 2; Slovakia: 1; Slovenia: 2; Turkey: 7), gathered with ten different instruments (CINS, ECKT, KAEVO, ORI, ACORNS, and self-developed).
With this in mind, the results for pre-service teachers show partly alarmingly low levels of knowledge about evolution. Knowledge about evolution of pre-service teachers seems to be an issue (low to moderate knowledge about evolution or frequently occurring misconceptions) in several countries: Turkey (Akyol et al. 2010; Deniz and Sahin 2016; Graf and Soran 2010; Keskin and Köse 2015; Šorgo et al. 2014; Tekkaya et al. 2011), Germany (Beniermann 2019; Fiedler et al. 2017; Graf and Soran 2010), Greece (Athanasiou and Mavrikaki 2014), Slovenia (Šorgo et al. 2014; Torkar and Šorgo 2020), Czech Republic (Šorgo et al. 2014), and Slovakia (Šorgo et al. 2014). Pre-service teachers of different fields showed very low knowledge about evolution in two studies (Greece: Athanasiou and Mavrikaki 2014; Slovenia: Torkar and Šorgo 2020), low knowledge in two studies (Germany and Turkey: Graf and Soran 2010; Greece: Athanasiou and Mavrikaki 2014), and moderate knowledge in one study (Nehm et al. 2013). Studies that focused on pre-service science or pre-service biology teachers revealed a variety of levels of knowledge about evolution, from unexpectedly very low (Greece: Turkey: Akyol et al. 2010; Akyol et al. 2012; Deniz and Sahin 2016) and low (Turkey: Tekkaya et al. 2011) to moderate (Germany: Nehm et al. 2013; primary and lower secondary education, Großschedl et al. 2018) and rather high knowledge about evolution (Germany: upper secondary education, Großschedl et al. 2018). Results from open response instruments confirmed that the context effects in evolution assessment found in European university students (Göransson et al. 2020) were also present in pre-service teachers. Examples of evolutionary adaptation in animals were apparently easier to explain than examples in plants (Großschedl et al. 2018; Nehm et al. 2013). The same effect was found for examples including the gain of traits in contrast to the loss of traits (Großschedl et al. 2018; Nehm et al. 2013).
These results indicate that the ratio of gain/loss and animal/plant items in an instrument can influence the measurement outcome to a large degree, which should be taken into account in future standardized assessments across Europe (see also Nehm et al. 2012).
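To make this point concrete, the following minimal sketch computes the expected proportion of correct answers for two instruments that differ only in their mix of trait-gain and trait-loss items. The per-context success rates are hypothetical values chosen purely for illustration, not data from any of the cited studies, and the function name is likewise an assumption.

```python
# Hypothetical probabilities that a respondent explains an item correctly.
# These numbers are illustrative assumptions, NOT results from any cited study.
P_CORRECT = {"gain": 0.70, "loss": 0.50}


def expected_proportion_correct(item_counts: dict) -> float:
    """Expected share of correct answers for a given mix of item contexts.

    item_counts maps an item context ("gain" or "loss") to the number of
    items of that context in the instrument.
    """
    total_items = sum(item_counts.values())
    if total_items == 0:
        raise ValueError("an instrument needs at least one item")
    expected_correct = sum(P_CORRECT[context] * n for context, n in item_counts.items())
    return expected_correct / total_items


# Two hypothetical eight-item instruments given to the same sample.
gain_heavy = {"gain": 6, "loss": 2}
loss_heavy = {"gain": 2, "loss": 6}

print(f"gain-heavy instrument: {expected_proportion_correct(gain_heavy):.2f}")
print(f"loss-heavy instrument: {expected_proportion_correct(loss_heavy):.2f}")
```

Under these assumed rates, the gain-heavy instrument yields a higher expected score than the loss-heavy one (0.65 versus 0.55), although the respondents and their underlying knowledge are identical.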
Misconceptions, predominantly teleological misconceptions, were also present among pre-service teachers (Germany: Graf and Soran 2010; Greece: Athanasiou and Mavrikaki 2014; Turkey: Keskin and Köse 2015; Tekkaya et al. 2011; Slovenia: Torkar and Šorgo 2020). In contrast to other university students, knowledge about evolution did not consistently increase with biology education level across different European pre-service teacher samples.

In-service teachers
Seven studies on in-service teachers were found in four countries (Greece: 4, Serbia: 1, Turkey: 1, the United Kingdom: 1), gathered with four different instruments (CINS, ECKT, self-developed). Very low (Greece: Athanasiou et al. 2016; Serbia: Stanisavljevic et al. 2013), low (Greece: Athanasiou et al. 2016; Prinou et al. 2011; Stasinakis and Athanasiou 2016; Serbia: Stanisavljevic et al. 2013; Turkey: Tekkaya et al. 2012)  . This illustrates the persistence of misconceptions through all education levels that is likely to affect the quality of evolution instruction offered to the various groups of students.

Cross-country studies
Five publications include samples from two or more European countries. Four of them compare two countries in terms of knowledge about evolution (Croatia and Slovenia: Kralj et al. 2018; Germany and Turkey: Graf and Soran 2010; Belgium and the Netherlands: Pinxten et al. 2020) or with a focus on misconceptions (Germany and Sweden: Göransson et al. 2020; Belgium and the Netherlands: Pinxten et al. 2020). One study compares four countries regarding knowledge about evolution (Czech Republic, Slovakia, Slovenia, and Turkey: Šorgo et al. 2014).
Altogether, results from 15 different European countries on evolutionary knowledge were documented in the current review. Three or more publications were found in only three of these countries (Germany: 11, Greece: 7, Turkey: 9; see Fig. 1). This implies that little or no information is available concerning knowledge about evolution in most European countries. Thus, evolution education research in Europe should fill this gap in the future by conducting cross-country studies on comparable target groups, using the same instrument and providing evidence for local validity.

Acceptance of evolution

School students
Our review identified ten studies on school students' acceptance of evolution in six countries (Austria: 1, France: 1; Germany: 5, Italy: 1, Turkey: 1, the United Kingdom: 1), gathered with eight different instruments (MATE and self-developed).
Evolution acceptance in school students is rather high in three European countries (Germany: Beniermann 2019; Fenner 2013; Konnemann et al. 2016; the United Kingdom: Mead et al. 2018; Italy: Rufo et al. 2013). In three countries, studies reported moderate acceptance (Germany: Konnemann et al. 2016; Lammert 2012), mixed attitudes towards evolution (Austria: Eder et al. 2011) or even rejection (Turkey: Köse 2010) for this sampling group. The conflicting results for Germany point to an issue that has also been found in previous studies (e.g., Barnes et al. 2019; Mead et al. 2019): besides other reasons, the application of different measuring instruments can lead to inconsistent results. Konnemann et al. (2016) used a self-developed instrument as well as the MATE, reporting moderate acceptance of evolution for the MATE and at the same time positive attitudes towards evolution for a great majority of the students (87.6%) based on the self-developed instrument. In both studies that revealed moderate acceptance (Konnemann et al. 2016; Lammert 2012), the MATE was used. Beniermann (2019) and Fenner (2013), who reported rather high acceptance, used self-developed measurement instruments.
The results show that only a few school students in Europe seem to reject evolution. Predominant rejection occurred only in one Turkish study (Köse 2010), where evolution was recently banned from textbooks (Genç 2018). Although there is only one study on Turkish school students, the results shown by Köse (2010) are in accordance with results of studies on Turkish pre-service teachers (e.g., Akyol et al. 2012; Deniz and Sahin 2016; Graf and Soran 2010).

University students
Only five studies on university students who are not pre-service teachers were reported in four countries (Germany, Spain, Turkey, and the United Kingdom), gathered with five instruments (I-SEA and self-developed). According to the authors, students in all surveyed samples largely accept evolution (Germany: Beniermann 2019; Spain: Gefaell et al. 2020; Turkey: Annaç and Bahçekapili 2012; the United Kingdom: Betti et al. 2020; Southcott and Downie 2012). Although this is generally good news, the explanatory power of a total of five studies is rather low. More research on university students is needed to confirm this tendency.
Furthermore, a crucial point when comparing studies that use different instruments is the categorization of the mean scores. For example, Annaç and Bahçekapili (2012) reported a "high acceptance" for a mean score that reflects a low to moderate acceptance of evolution based on the MATE scale (see Table 2). This illustrates the importance of standardizing comparative studies across countries.
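One simple way to put mean scores from instruments with different score ranges on a common footing is to rescale them to the interval 0-1. The sketch below does this for three means reported above, using the score ranges given for the I-SEA (24-120) and the MATE (20-100). The function name and the rescaling approach are assumptions made for illustration; they are not the categorization used in this review, and rescaling alone does not make different instruments measure the same construct.

```python
def rescale(score: float, minimum: float, maximum: float) -> float:
    """Map a raw sum score onto 0-1 given the instrument's possible score range."""
    if maximum <= minimum:
        raise ValueError("maximum must be greater than minimum")
    if not minimum <= score <= maximum:
        raise ValueError("score lies outside the instrument's range")
    return (score - minimum) / (maximum - minimum)


# (instrument, sample, reported mean, lowest possible score, highest possible score)
# Means and score ranges are the values quoted in the text above.
reported_means = [
    ("I-SEA", "UK life sciences undergraduates (Betti et al. 2020)", 93.12, 24, 120),
    ("MATE", "Spanish third-year students (Gefaell et al. 2020)", 87.20, 20, 100),
    ("MATE", "Turkish pre-service teachers (Bilen and Ercan 2016)", 57.40, 20, 100),
]

for instrument, sample, mean, lowest, highest in reported_means:
    print(f"{instrument:6s} {rescale(mean, lowest, highest):.2f}  {sample}")
```

Any verbal category (low, moderate, high) attached to such rescaled values would still require cut-offs that are agreed on across studies.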

Pre-service teachers
Fifteen studies on pre-service teachers' acceptance of evolution were discovered in four countries (Germany: 5, Greece: 2, Turkey: 7, the United Kingdom: 1), gathered with three different instruments (MATE and self-developed).
In contrast to other university students, European pre-service teachers have been the subject of many studies. Additionally, the situation is more diverse than for school students and other university students. In some countries, the surveyed samples largely accept evolution (Germany: Graf and Soran 2010; Großschedl et al. 2014; Großschedl et al. 2018; Konnemann et al. 2018; Nehm et al. 2013; the United Kingdom: Arthur 2013), whereas in others the surveyed samples have undecided positions or rather reject evolution (Greece: Turkey: Akyol et al. 2010; Deniz et al. 2011; Deniz and Sahin 2016; Graf and Soran 2010; Irez and Bakanay 2011; Bilen and Ercan 2016; Yüce and Önel 2015). These alarming results for Greece and Turkey should be investigated further, especially in view of the particularly important role of pre-service teachers in evolution education. In both countries, evolution only plays a minor role in school curricula.

In-service teachers
Seven studies on in-service teachers' acceptance of evolution were found in four countries (Greece: 1, Serbia: 1, Turkey: 1, the United Kingdom: 2), gathered with two different instruments (MATE and self-developed). In almost all of these countries, in-service teachers showed moderate (Serbia: Stanisavljevic et al. 2013; Turkey: Tekkaya et al. 2012) to high acceptance (Germany: Beniermann 2019; Greece: Athanasiou et al. 2016; Serbia: Stanisavljevic et al. 2013; the United Kingdom: Buchan 2019; Downie et al. 2018). In one study, the majority of biology teachers rejected evolution (Turkey: Köse 2010). Despite the crucial importance of in-service teachers for fostering knowledge about evolution and acceptance of evolution, the number of studies in Europe is quite low. The partly alarming results concerning pre-service teachers in the present review suggest that this issue could also arise in future studies on in-service teachers in Europe.
A comparison of acceptance among different education levels shows that rejection of evolution was mainly found in university students, but rarely in school students and in-service teachers (but see Köse 2010). As with knowledge about evolution, the number of studies varied among European countries. Much research has been conducted in Turkey (especially for university students) and Germany. In all other countries, a clear picture of evolution acceptance is missing. Only two publications compare acceptance of evolution among European countries by means of the same instrument within comparable groups (Clément 2015a; Graf and Soran 2010).
Results from 35 different European countries on evolution acceptance were documented in this article. Three or more publications were found in only four of these countries (Germany: 9, Greece: 3, Turkey: 10, the United Kingdom: 6; see Fig. 2). As with evolutionary knowledge, little or no information is available about acceptance of evolution in most European countries.

Relationship between acceptance of evolution and knowledge about evolution
European studies that investigated both acceptance of and knowledge about evolution reported very different results concerning the existence and strength of the relationship between these factors. However, some trends are visible, for example the lacking or weak correlation between acceptance and knowledge for primary and secondary school students in Germany indicating an increase of strength of the relationship the higher the educational level (Beniermann 2019;Fenner 2013;Lammert 2012). This assumption is supported by the fact that based on the same instruments (ATEVO and KAEVO) Beniermann (2019) showed an increase of the correlation coefficient from lower secondary students to in-service biology teachers. Other studies on pre-service or inservice teachers in Europe showed weak (Germany: Graf and Soran 2010; Großschedl et al. 2014;Turkey: Akyol et al. 2012;Greece: Athanasiou et al. 2012) or moderate (Germany: Großschedl et al. 2018;Nehm et al. 2013;Turkey: Deniz and Sahin 2016;Serbia: Stanisavljevic et al. 2013; the United Kingdom: Buchan 2019) positive relationships between acceptance and knowledge. Based on these results there is no effect of the used instruments visible as the mentioned studies applied either a combination of the MATE and the ECKT or utilized the MATE and the CINS. Both combinations of instruments lead to weak as well as moderate positive correlations between acceptance and knowledge.
However, in contrast to these results, other studies report no significant correlation for pre-service and in-service teachers in Turkey (Akyol et al. 2010; Graf and Soran 2010; Tekkaya et al. 2012) and Greece. Except for Graf and Soran (2010), who used different instruments, all of these studies used a combination of the ECKT and the MATE to assess knowledge and acceptance. Although it is noteworthy that almost all non-significant correlations stem from the combination of ECKT and MATE, a valid comparison between combinations of instruments requires that the instruments be applied to comparable or, ideally, the same samples.
Overall, the results underline that knowledge about evolution and acceptance of evolution are two separate constructs, since no clear connection between these two variables is visible. This once more demonstrates the importance of measuring instruments that clearly distinguish between acceptance of evolution and knowledge about evolution, as discussed in several methodological considerations (Beniermann 2019; Kahan 2015; Konnemann et al. 2012; McCain and Kampourakis 2018; Roos 2014; Smith 2010).
Based on this review, the relationship between acceptance of evolution and knowledge about evolution in Europe remains an open question (see Barnes et al. 2019; Dunk et al. 2019). Investigating it requires a more standardized assessment of both factors, which would yield a more comparable database.
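To make the terms "weak" and "moderate" concrete, the following minimal sketch computes a Pearson correlation between paired acceptance and knowledge scores and attaches a verbal label. The function names and the cut-offs (0.1, 0.3, 0.5) are illustrative assumptions following a common rule of thumb; the reviewed studies may have used other coefficients or thresholds.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equally long score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two score lists of equal length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant scores")
    return sxy / sqrt(sxx * syy)


def label_strength(r, weak=0.1, moderate=0.3, strong=0.5):
    """Verbal label for |r|; the cut-offs are assumed, not taken from the review."""
    size = abs(r)
    if size < weak:
        return "negligible"
    if size < moderate:
        return "weak"
    if size < strong:
        return "moderate"
    return "strong"


# Hypothetical paired scores for four respondents (not data from any study).
acceptance = [1, 2, 3, 4]
knowledge = [2, 4, 6, 8]
r = pearson_r(acceptance, knowledge)
print(round(r, 3), label_strength(r))
```

Because the label depends entirely on the chosen cut-offs, two studies can describe the same coefficient differently, which is one reason why reporting the coefficient itself supports comparison better than reporting the label alone.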

Religiosity and other factors influencing acceptance of evolution
A negative relationship between religious faith and acceptance of evolution was found for primary and secondary education students (Eder et al. 2011; Lammert 2012), university students (Annaç and Bahçekapili 2012; Beniermann 2019; Betti et al. 2020; Graf and Soran 2010; Southcott and Downie 2012), including biology pre-service teachers (Deniz et al. 2011; Deniz and Sahin 2016), as well as in-service teachers (Clément et al. 2012) across European countries, which makes the close relationship between these constructs visible. However, it has been shown in the USA (McCain and Kampourakis 2018) as well as in Europe (Germany; Beniermann 2019) that religious faith alone does not predict rejection of evolution and that a large percentage of religious believers accept evolution.
Acceptance of evolution differed between denominations for primary and secondary education students, as well as university students, in Austria, Germany and the UK, with the lowest acceptance scores for Muslims (Eder et al. 2011; Fenner 2013; Lammert 2012; Southcott and Downie 2012) or Christian Free Churchers (Beniermann 2019; Konnemann et al. 2016) and the highest scores for students without a denomination (Beniermann 2019; Lammert 2012; Konnemann et al. 2016). It should be emphasized that subsamples of Muslims and Christian Free Churchers in European samples are normally very small, so results for these groups are difficult to generalize. Clément (2015a) and Clément et al. (2012) showed how in-service teachers in Europe differed in their acceptance of evolution depending on the predominant affiliation in the country samples. For example, Orthodox teachers in Russia showed the most creationist positions (Charles and Clément 2018), and European countries with a large share of Catholic (Poland, Malta) or Orthodox (Georgia, Romania) respondents tend to reject evolution more often (Clément 2015a). However, in their cross-country comparison Clément et al. (2012) showed that even countries with a comparable share of Orthodox teachers as members of a conservative religion (Cyprus, Georgia, Romania and Serbia) differ highly in their creationist positions (between 54% in Georgia and 11% in Serbia). Clément (2015a) concluded that the observed differences between countries are mostly related to the countries and not to the denomination: "Globally, in the less economically developed countries, teachers are more believing in God and practicing their religion, whatever is this religion, and they are more creationist and more often against a separation between science and religion" (Clément 2015a, p. 286).
Although some religious affiliations are important parts of several national backgrounds, they cannot be separated from other important factors like national history, politics and economy (Clément 2015b). This "strong influence of the national socio-cultural context" (Clément et al. 2012) was also confirmed by comparison of Catholic, Protestant and Muslim teachers in different countries (Clément 2015a).
Another important path for future research within Europe is to assess which factors mainly influence the acceptance of evolution. Besides religiosity, conceptions of the nature of science (NOS; Smith 2010; Smith and Siegel 2004), generally regarded as fundamental components of scientific literacy, may also play a critical role. Akyol et al. (2010) as well as Graf and Soran (2010) identified a statistically significant positive contribution of understanding of the nature of science to the acceptance of evolution among pre-service teachers. Moreover, attitudes towards science have been found to be a significant predictor of acceptance of evolution for German (Graf and Soran 2010; Großschedl et al. 2014) and Turkish (Graf and Soran 2010) pre-service teachers. Therefore, future studies should further explore the correlation between understanding the nature of science (in its epistemological and sociological aspects), attitudes towards science and acceptance of evolution in Europe.

Cross-country studies
Overall, samples from more than one country were surveyed on acceptance of evolution and/or knowledge about evolution in only four studies (Clément 2015a; Göransson et al. 2020; Graf and Soran 2010; Pinxten et al. 2020; Šorgo et al. 2014). Although the results of Clément (2015a) are based on several multiple-choice questions rather than an established measurement instrument, they show that teachers' views on evolution and religiosity are highly connected to their national socio-cultural background.
Numerous studies have been conducted in only a few countries (mainly Greece, Turkey, and Germany). Very few instruments have been used multiple times and the target groups are very diverse. Further research will be necessary to get a clear overview of the status of knowledge and acceptance of evolution among different education levels in Europe.
In sum, a comprehensive overview of knowledge and acceptance of evolution in Europe, conducted with a comparable sample and the same high-quality instrument in each country, is still missing.

Measuring instruments
The instruments identified in European studies to measure knowledge about evolution and attitudes towards evolution focus on different aspects of the target construct. In particular, the instruments that aim to measure knowledge about evolution differ in the evolutionary concepts they cover (e.g., KAEVO vs. CINS).
With regard to measuring acceptance of evolution, Barnes et al. (2019) already showed in a comparative analysis that different approaches in some cases lead to different results and hence different interpretations. In a German sample, Konnemann et al. (2016) also obtained diverging results based on two different measures. However, even globally only a few publications have investigated whether different instruments result in different conclusions about attitudes towards evolution (Metzger et al. 2018; Rachmatullah et al. 2018; Romine et al. 2018; Nehm 2018, 2019), and even these comparative studies came to different conclusions. For example, Romine et al. (2018) concluded that the MATE, GAENE, and I-SEA can be considered as a single scale to measure one or two factors without losing quantitative interpretability, while Barnes et al. (2019), using the I-SEA, GAENE, MATE and the 100-point instrument of self-defined acceptance, emphasized that different instruments produced partly inconsistent results. These inconsistent results were mostly visible for Christian and Mormon respondents. However, the differences did not occur for all instruments or between all groups. The effect of different instruments was mainly visible for the effect of evolution understanding on evolution acceptance: evolution understanding was a better predictor when evolution acceptance was assessed with the MATE or the I-SEA microevolution scale. When people identified as Protestant or Mormon, measured values for acceptance of evolution differed depending on the applied instrument.
These reported inconsistent results may be partly explained by the different focus on evolution in general, microevolution, macroevolution or human evolution, since several studies in the US showed that levels of acceptance are higher for microevolution than for macroevolution or human evolution (Nadelson and Hardy 2015; Nadelson and Southerland 2012). Theoretically, human evolution as well as macroevolution are in conflict with many religious beliefs, while even creationists accept microevolution to some extent (Pobiner 2016; Scott 2008). In Europe this difference was visible in the only study that used the I-SEA (Betti et al. 2020). Furthermore, one European study emphasized the lower acceptance for evolution of the human mind compared to evolution in general (Beniermann 2019).
Another crucial factor in choosing an instrument to measure acceptance of evolution in Europe is the distinction between acceptance of evolution and religious belief. The framing of questions on attitudes towards evolution is of crucial importance, since the way in which the relationship of evolution, faith and creationism is presented will influence the results of a survey (Elsdon-Baker 2015; Kampourakis and Strasser 2015). While Romine et al. (2017) argued for the US context that the inclusion of explicitly creationist views in assessments of acceptance of evolution may not be a problem, McCain and Kampourakis (2018) showed that public polls about the acceptance of evolution lead to different results depending on the inclusion of a statement about God in the questions about evolution. This distinction may be even more important when investigating the relationship between acceptance of evolution and religious faith in less religious countries (Beniermann 2019), as is the case in several European countries (Clément 2015a).
The diversity of the instruments used to assess acceptance of evolution and knowledge about evolution in Europe is one major reason why comparisons within and between educational groups and countries are complicated or even of questionable validity. One approach to address this issue is to build categories of acceptance and knowledge levels in order to compare results derived from different instruments. Most published scales do not recommend categories for interpreting survey results, so authors of individual studies apply categories (e.g., "low knowledge", "moderate acceptance") of their own. Using common categories standardizes the comparison between studies, even if our standardized categories in some cases conflict with the interpretations of the study authors.
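As an illustration of such a categorization, the sketch below rescales raw scores from instruments with different ranges to a common 0 to 100 scale and assigns a category label. The score ranges, the cut-offs (50 and 75) and the function names are hypothetical and chosen only for this example; they are not the categories applied in this review.

```python
def to_percent(score, min_score, max_score):
    """Rescale a raw score to 0-100 given the instrument's possible range."""
    if max_score <= min_score:
        raise ValueError("max_score must exceed min_score")
    if not min_score <= score <= max_score:
        raise ValueError("score lies outside the instrument's range")
    return 100.0 * (score - min_score) / (max_score - min_score)


def categorize(percent, cutoffs=((50.0, "low"), (75.0, "moderate"))):
    """Map a rescaled score to a label; the cut-offs are assumed values."""
    for upper_bound, label in cutoffs:
        if percent < upper_bound:
            return label
    return "high"


# Hypothetical mean scores from two instruments with different ranges.
study_a = to_percent(60, min_score=20, max_score=100)  # scale from 20 to 100
study_b = to_percent(8, min_score=0, max_score=20)     # scale from 0 to 20
print(study_a, categorize(study_a))
print(study_b, categorize(study_b))
```

Rescaling by the possible range treats every scale point as equivalent across instruments, which is itself an assumption; instruments covering different concepts may not be comparable even after rescaling.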

Validity issues
In total, 26 studies in this review used their own instruments to assess acceptance or knowledge about evolution, making it more difficult to compare results between studies. In addition to studies that used previously published instruments, 31 other instruments were used to assess acceptance or knowledge about evolution in Europe since 2010. Most likely, not all of these instruments have undergone a validation procedure (e.g., based on AERA 2014). The literature review demonstrates that evidence for validity and reliability is often not reported in these publications: only six of the 15 studies identified in the present review that used their own instrument to assess acceptance of evolution provided at least one source of evidence for validity of the instrument (see Additional file 5). For non-established instruments to assess knowledge about evolution, nine studies reported at least one source of evidence for validity, while seven studies did not provide any evidence (see Additional file 4).
However, there are validity issues even for most of the published scales, not to mention local validity for the respective studies that used these instruments (see Additional file 3). The present review showed that six of nine studies that used the CINS in a European context did not report any source of evidence for local validity of the CINS within their setting. Those that provided evidence for validity reported results for PCA (internal structure; Athanasiou and Mavrikaki 2014; Pinxten et al. 2020) or referred to an expert review (content validity; Tekkaya et al. 2011). Evidence for reliability in the form of internal consistency was reported for five of the nine studies. Altogether, four of these nine studies provided evidence for neither validity nor reliability (Annaç and Bahçekapili 2012; Buchan 2019; Lazaridis et al. 2011; Nehm et al. 2013).
The majority of studies utilizing the ECKT did not provide any evidence for validity. Only one out of seven studies reported results for dimensionality. In four of the seven studies, evidence for reliability was provided via internal consistency (Athanasiou et al. , 2016; Tekkaya et al. 2012). In sum, three studies provided evidence for neither validity nor reliability (Akyol et al. 2010; Deniz and Sahin 2016; Stanisavljevic et al. 2013).
Two of three studies using the KAEVO reported multiple sources of evidence for validity (content validity, internal structure) and reliability (Beniermann 2019; Kuschmierz et al. 2020). One study provided evidence for neither validity nor reliability (Torkar and Šorgo 2020). Gefaell et al. (2020), who used the KEE, provided one source of evidence for validity (external structure) and reliability (internal consistency). One of three studies using the ORI provided evidence for validity (content validity; Göransson et al. 2020). Göransson et al. (2020) and one additional study provided evidence for reliability (Fiedler et al. 2017). Neither of the two studies using the ACORNS provided evidence for validity, but both provided evidence for reliability (Großschedl et al. 2018; Nehm et al. 2013). Betti et al. (2020), who used the I-SEA, provided evidence for validity (internal structure) but not for reliability. Seven of 21 studies using the MATE provided evidence for local validity, via internal structure or content validity, and for reliability (Großschedl et al. 2014; Irez and Bakanay 2011; Konnemann et al. 2016; Lammert 2012; Tekkaya et al. 2012; Yüce and Önel 2015). Almost all studies (18) provided evidence for reliability, predominantly via internal consistency (Akyol et al. 2010; Athanasiou et al. , 2016; Bilen and Ercan 2016; Deniz et al. 2011; Deniz and Sahin 2016; Gefaell et al. 2020; Großschedl et al. 2014, 2018; Irez and Bakanay 2011; Konnemann et al. 2016, 2018; Lammert 2012; Mead et al. 2018; Tekkaya et al. 2012; Yüce and Önel 2015). Only three studies provided evidence for neither reliability nor local validity (Buchan 2019; Nehm et al. 2013; Stanisavljevic et al. 2013).
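Internal consistency is commonly reported as Cronbach's alpha. Whether each reviewed study used this particular coefficient is an assumption here, and the response data below are invented. As a minimal sketch, alpha can be computed from an item-by-respondent score table as follows:

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of totals)
    """
    if len(responses) < 2:
        raise ValueError("need at least two respondents")
    n_items = len(responses[0])
    if n_items < 2:
        raise ValueError("need at least two items")
    if any(len(row) != n_items for row in responses):
        raise ValueError("every respondent must answer every item")
    # Sample variance of each item across respondents.
    item_variances = [
        variance([row[i] for row in responses]) for i in range(n_items)
    ]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("alpha is undefined when total scores do not vary")
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses of four people to three Likert-type items.
sample = [
    [4, 5, 4],
    [2, 2, 3],
    [5, 5, 5],
    [3, 2, 2],
]
print(round(cronbach_alpha(sample), 3))
```

Because alpha is computed from the responses of a specific sample, a value reported for the original validation sample does not carry over automatically to a new country or education level, which is why evidence for local reliability matters.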
The importance of providing evidence for local validity and reliability has emerged in the field of evolution education within the last 12 years (Nehm and Schonfeld 2008). Awareness of the necessity to provide proper evidence for local validity and reliability has thus increased steadily over the years.
However, even studies that were published within the last 2 years are in some cases lacking evidence of local validity and reliability.
Furthermore, most published scales have been developed and validated for specific target groups, but are often used for different groups (e.g., different educational levels), even if it is questionable whether they are suitable for these groups (e.g., for MATE: Wagler and Wagler 2013). Particularly in the case of knowledge instruments, this raises the question of whether categories for interpreting results should be adjusted when the same instrument is applied at different educational levels.
To date, only a few instruments have been developed for multiple education levels (e.g., KAEVO and MATE).

Conclusions
The current state of research on knowledge about and attitudes towards evolution among students and teachers in the different European countries varies greatly in the number of publications and the instruments used. Many different instruments have been used, and most of the established instruments only rarely, in parts or in modified versions. Regardless of whether established, self-developed or only locally distributed instruments were utilized, only about one-third of all studies on acceptance and/or knowledge about evolution provided evidence for local validity and reliability. Additionally, very few studies compared similar target groups in two or more European countries.
This situation makes further research urgent in order to obtain a comprehensive overview of the state of knowledge about evolution and acceptance of evolution in the different educational settings in Europe. The available database is not sufficient to compare European countries reliably. The science education community should aim for standardized assessment of acceptance and knowledge about evolution in comparable target groups in many different European countries, in order to investigate how the various cultural backgrounds and the different school systems within Europe may lead to differences in acceptance and understanding of evolution. In terms of acceptance, besides the national socio-cultural context and denominations, curricula seem to play a major role, as a lack of evolution in curricula tended to be associated with a rejection of evolution in some countries.
Additionally, future research should attempt to explain what underlies the worrying persistence of misconceptions through all European educational levels that our results have highlighted. Fostering conceptual change, instead of simply adding to existing knowledge, is held by some to be a major goal of education (Sinatra et al. 2008). Drawing causal and comparative inferences will only be possible after a rigorous assessment of how much and how well European school curricula cover evolution (as pursued by EuroScitizen COST Action (CA17127)).
We emphasize the need for standardized research on European evolution education settings and, subsequently, for developing ways not only of soundly investigating and properly reporting evolutionary knowledge and acceptance of evolution, but also of teaching evolution in an evidence-based manner.