A systematic review of educational design research in Finnish doctoral dissertations on mathematics, science, and technology education

Since educational design research (EDR) was introduced to educational research at the beginning of the 1990s, it has gained recognition as a promising research approach that bridges the gap between research and practice in education. This paper aims to investigate how EDR has been utilised and developed and which challenges it has faced by systematically reviewing 21 Finnish EDR doctoral dissertations on mathematics, science, and technology education published between January 2000 and October 2018. The findings indicate that all dissertations yielded practical and theoretical contributions. Moreover, common EDR characteristics, including the use of educational problems in practice as a point of departure, research in real-world settings, evolution through an iterative process, development of practical interventions, and refinement of theoretical knowledge, were found in all dissertations. Most of the doctoral researchers were confronted with challenges, such as high demand for EDR with limited resources and difficulties associated with multidisciplinary teamwork. However, the dissertations were diverse in terms of research contexts, practical educational problems, research outcomes, research methodologies, scale, and collaboration. This systematic review not only enhances the understanding of the utilisation, development, and challenges of EDR but also provides implications for future EDR.


Introduction
Since educational design research (EDR) was introduced to educational research at the beginning of the 1990s, it has gained recognition as a promising research approach that bridges the gap between theoretical research and practice in education. Globally, EDR is still developing (Easterday, Lewis, & Gerber, 2017), as it is relatively young compared to other research approaches in education (Bell, 2004; Ørngreen, 2015). Over the past three decades, researchers have conducted EDR from a variety of theoretical perspectives and traditions for various purposes and contexts using different research methods (Bell, 2004; Prediger, Gravemeijer, & Confrey, 2015). While they have provided evidence supporting the usefulness of EDR, some have critiqued its limitations and challenges.
context of mathematics, science, and technology education at all levels. This lens was chosen for two reasons. First, it is likely that EDR is conducted differently in different educational fields, and therefore examining its application in specific fields may help refine the understanding of how to carry out EDR (McKenney & Reeves, 2012). Second, EDR has been adopted in a growing body of research on mathematics, science, and technology in education (Anderson & Shattuck, 2012; Prediger et al., 2015; Zheng, 2015).

Overview of EDR
In this paper, we use the term educational design research to describe a research approach that is also known as design experiments, design research, design-based research, and development(al) research. EDR uses educational problems in practice as a point of departure and seeks to develop practical solutions to improve educational practices and advance usable knowledge through iterative processes in real-world settings (McKenney & Reeves, 2019; Plomp, 2013).
The manifold studies on EDR differ in terms of goals, forms, processes, outcomes, and other aspects (e.g., Bell, 2004; Plomp, 2013; Prediger et al., 2015). In addition, scholars have defined EDR in a variety of ways. Table 1 provides examples of EDR characteristics proposed by Anderson and Shattuck (2012); Cobb, Confrey, diSessa, Lehrer, and Schauble (2003); Juuti and Lavonen (2006); McKenney and Reeves (2019); and Wang and Hannafin (2005). Nevertheless, there are some commonalities among the definitions: intervention in real-world settings to improve practices, evolution through iterative cycles, development of practical solutions (i.e., interventions), and refinement of theoretical knowledge.
Table 1. Variants of educational design research (EDR) characteristics proposed by different scholars.

| Title | Characteristics of EDR | Reference |
| --- | --- | --- |
| Design-based research | 1. Situated in real educational contexts 2. Focusing on the design and testing of interventions 3. Utilising mixed methods 4. Involving multiple iterations 5. Entailing partnership between researchers and practitioners 6. Providing design principles 7. Different from action research 8. Having a practical impact on practice | Anderson and Shattuck, 2012, pp. 16-18 |
| Crosscutting features of design experiments | 1. Developing theories about the learning process and ways to facilitate that learning 2. Interventionist: bringing about educational innovation 3. Prospective and reflective 4. Iterative cycles of intervention and revision 5. Practice orientated | Cobb, Confrey, diSessa, Lehrer, and Schauble, 2003, pp. 9-11 |
| Features of the design-based research | 1. Iterative process 2. Developing usable artefacts 3. Rendering novel educational knowledge | Juuti and Lavonen, 2006, pp. 59-63 |
| Features of the design research process | 1. Theoretically oriented 2. Interventionist: developing solutions informed by existing knowledge, testing, and participants 3. Collaborative: working in collaboration with others 4. Responsively grounded process 5. An iterative process of investigation, development, testing, and refinement | McKenney and Reeves, 2019, pp. 12-16 |
| Characteristics of design-based research | 1. Pragmatic: refining theory and practice 2. Grounded in relevant research, theory, and practice 3. Interactive: working together with participants; an iterative cycle of analysis, design, implementation, and redesign; and flexible when necessary 4. Integrative: using mixed research methods 5. Contextual research results and generated design principles | Wang and Hannafin, 2005, p. 8 |

Descriptions of phases of EDR differ between scholars (cf. Cobb et al., 2003; Easterday et al., 2017).
According to Plomp (2013), there are three main phases: (1) preliminary research (i.e., literature research, needs and context analysis, and theoretical framework development), (2) the development phase (i.e., the iterative design phase), and (3) the assessment phase (i.e., the summative evaluation of the intervention and recommendations for improvement; cf. McKenney & Reeves, 2019, who described the initial phase, design phase, and evaluation). McKenney and Reeves (2019) divided EDR into cycles of different sizes: single subcycle, multiple subcycles, and overall design research project. A single subcycle is the completion of one of the three main phases (i.e., preliminary research, development, or assessment). Multiple subcycles consist of several subcycles, but not as many as the whole EDR project. An overall design research project can range from one multiple subcycle that consists of three subcycles of each phase to several multiple subcycles.
EDR contributes to both practice and theory. In terms of its practical contribution, EDR uses an iterative process of design, assessment, and redesign in authentic contexts to develop an intervention to solve an educational problem (McKenney & Reeves, 2019). Additionally, according to Edelson (2002), EDR can help to develop three types of theory: domain theories, design frameworks, and design methodologies (cf. Plomp, 2013). Domain theories describe real-world phenomena and the outcomes of design implementation; design frameworks describe the characteristics of successful solutions to the problem in the studied context; design methodologies provide guidelines for successfully achieving the research aims.

EDR challenges and recommendations
Scholars have addressed several challenges of EDR and provided recommendations for how to overcome them. First, the triangulation of data sources, data collection methods, data types, theories, and evaluators is recommended to better understand complex real-world phenomena and enhance the reliability and validity of EDR (e.g., Design-Based Research Collective [DBRC], 2003; McKenney & Reeves, 2019). Nevertheless, triangulation and the iterative nature of EDR usually lead to over-methodologisation, that is, the collection and analysis of excessive amounts of data, which sometimes may not lead to adequate results (e.g., Brown, 1992; Dede, 2004). Second, EDR researchers often take on multiple roles (e.g., researcher, designer, implementor, and evaluator of the intervention), which may lead to conflicts of interest (e.g., Plomp, 2013). Triangulation of researchers can enhance the objectivity of EDR (Plomp, 2013). Third, several EDR studies tend to be under-conceptualised, as they lack a profound theoretical foundation and do not seek to provide theoretical contributions (e.g., Dede, 2004). Therefore, EDR should not only provide solutions to problems but also yield a variety of theories, particularly theories related to the design process (McKenney & Reeves, 2019). Fourth, multidisciplinary collaboration among various experts from relevant fields is recommended for ensuring the feasible and successful development of solutions to complex educational problems (e.g., Wang & Hannafin, 2005). However, multidisciplinary teamwork requires, for example, a shared understanding among team members, strong group cohesion, and respect for others, and thus teamwork can be tiresome and contentious (McKenney & Reeves, 2019).
Fifth, the involvement of various participant groups that are relevant to the implementation of the intervention (e.g., teachers, students, and organisations) is advised to better understand complex authentic contexts and enhance respondent triangulation (McKenney & Reeves, 2019; Ørngreen, 2015). Sixth, rather than refining only one design idea, working with alternative designs and exploring solutions is recommended to ensure that the proposed intervention is the best solution to the problem (McKenney & Reeves, 2019; Ørngreen, 2015). Finally, Kelly (2013) proposed that, as EDR requires the investment of considerable resources, EDR should be employed only when truly needed, such as when facing a challenging educational problem with no satisfactory solution.

Previous reviews of EDR
Previous studies have investigated the utilisation and progress of EDR and other relevant issues with various focuses and review processes. Anderson and Shattuck (2012) reviewed and defined the characteristics of EDR, including interventions in real educational contexts, a focus on the design and testing of a significant intervention, the use of mixed methods, multiple iterations, a collaborative partnership between the researcher and practitioners, the provision of design principles, differences from action research, and practical impact. The authors also conducted a review of the 47 most cited EDR articles from 2002 to 2011. Quantitative and qualitative content analyses were conducted to investigate geographic, disciplinary, and curricular focuses and the interventions, iterations, and outcomes of the articles. They found that design research was increasingly employed in educational contexts and that the majority of studies were conducted in North America. The most commonly studied subject was science; the main context was K-12; and most interventions involved technology. Thirty-one articles were empirical studies that were part of a multi-iterative research project. All of the empirical studies involved were either technological and instructional design interventions or instructional methods, models, and strategies. Typically, mixed methods were employed. Most focused on furthering theoretical knowledge and developing applications to improve learners' learning outcomes or attitudes. Although the results of their review affirmed the great promise of EDR due to its integration of educational theory and practice, Anderson and Shattuck (2012) argued that work still needs to be done regarding educational innovations. Moreover, they recommended that future reviews perform a more detailed investigation of the full text of articles and investigate a broader set of articles. Their characterisation of EDR has been cited numerous times.
According to McKenney and Reeves (2013), most of the EDR characteristics defined by Anderson and Shattuck (2012) are similar to those reported by other authors. However, McKenney and Reeves identified that departure from a problem is an important characteristic of EDR that is missing from Anderson and Shattuck's (2012) list. Moreover, they criticised Anderson and Shattuck's (2012) systematic review for its limited search terms (design-based research and education), narrow dataset (i.e., only the most cited articles), and the use of only abstracts for a number of analyses (McKenney & Reeves, 2013). They called for the use of diverse search terms, an adequate dataset, and in-depth analyses of full texts to assess EDR progress in future studies (McKenney & Reeves, 2013). Kennedy-Clark (2013) provided an overview of EDR and emphasised Plomp's (2007) three phases of EDR (i.e., initial, prototyping, and assessment phases) and the contribution of iterative cycles to the development of design principles and the refinement of theories. Furthermore, she investigated how EDR characteristics were used in doctoral dissertations by critically reviewing six education dissertations utilising EDR that were published by different institutions in Australia, Europe, Africa, and North America from January 2000 to January 2013. Her search terms included design research, design-based research, education, phases, cycles, and iteration. The research contexts (i.e., teaching subjects and education levels), focuses, and duration of data collection cycles varied among the dissertations, but they all utilised mixed methods for data collection. Conducting iterative data collection phases, engaging with several expert groups, testing designs with different participation groups, and being flexible and adaptive appeared to assist the researchers in reflecting on their research, understanding the educational problem, and avoiding overstated claims and conclusions.
Finally, Kennedy-Clark's review demonstrated that the use of iterative design and development cycles or micro phases could increase the reliability and trustworthiness of research.
As researchers interested in EDR, we appreciate Kennedy-Clark's in-depth review of the potential benefits of EDR for education dissertations. However, the method was not sufficiently elaborated, and no overview of the information in the dissertations was provided. Revisiting the original article (Kennedy-Clark, 2013), Kennedy-Clark (2015) highlighted that researchers tend to concentrate on publishing their research findings and neglect to report their research methodologies. Therefore, there is a need for further investigation of how researchers employ EDR in their studies (Kennedy-Clark, 2015). Zheng (2015) noted that applications of EDR do not appear to live up to expectations. She investigated empirical studies that adopted EDR through a systematic review of 162 journal articles published between 2004 and 2013 and quantitative content analysis of the selected EDR studies in terms of demographics, research methods, intervention characteristics, and research outcomes. The findings show that higher education was the most common sample group, and natural science was the most commonly studied learning domain. Qualitative methods were most often adopted, mixed methods were the second most popular, and solely quantitative methods were not used in any studies. Nearly all studies collected miscellaneous data, including interviews, questionnaires, and notes; and most performed technological interventions. More than half of the studies designed, developed, and redesigned educational interventions in only one iteration cycle. Although the majority revised their interventions, only approximately half of the studies reported how they did so. Moreover, most studies relied heavily on measurements of learners' cognitive outcomes. Based on her findings, Zheng (2015) proposed that there is a need for EDR studies to apply multiple iterations and new approaches that pay more attention to the design process.
We value her work for its thorough review of a large number of EDR studies and because it improves the understanding of the EDR landscape over the past decade. Nevertheless, a more detailed qualitative analysis would have complemented her quantitative analysis and contributed to an even deeper understanding of the selected studies. Zheng (2015) recognised the shortcomings of her research and recommended more deliberate investigation and analysis of design activities and their functions.

Dissertation search and selection
To investigate how EDR has been employed in research on mathematics, science, and technology education and which challenges have confronted EDR researchers, we conducted a systematic review based on the recommendations of Anderson and Shattuck (2012), Kennedy-Clark (2015), McKenney and Reeves (2013), and Zheng (2015; see Section 2.3). Our data was collected from Finnish doctoral dissertations on mathematics, science, and technology education published between January 2000 and October 2018. We chose dissertations as our dataset because they report all iterative phases of the completed research, unlike articles, which often report only specific phases of research. We focused on Finnish dissertations because, as researchers in Finland, we expected our familiarity with the Finnish education system and practices to assist our review. It was not feasible to review all related dissertations completed at all Finnish universities because each university's repository uses a different database system, and there is no shared database containing all Finnish dissertations. Therefore, we decided to retrieve our data from the institutional repositories of the five Finnish universities that awarded the most qualifications and degrees in 2014: the University of Helsinki, University of Jyväskylä, University of Oulu, University of Tampere, and University of Turku (Official Statistics of Finland, 2015). The repository of the University of Eastern Finland, which provided the fourth most qualifications and degrees in 2014 and where a number of EDR dissertations have been completed, did not support the use of search terms for data retrieval. We also tried to retrieve dissertations of the University of Eastern Finland from Finna, a collection of search services providing access to material from Finnish university libraries. However, the Finna portal did not support a full-text search, which we used in our systematic review. 
Thus, we excluded the University of Eastern Finland and included the University of Tampere instead. Although our list of dissertations is not comprehensive, we believe that it provides an overview of the various dissertations published in Finland.
Our search terms included different terminologies that have been used to describe EDR in both English (design research, design-based research/design based research, development research/developmental research, and design experiments) and Finnish (design-tutkimu*/suunnittelututkimu*, design-perustai*/designperustei*/suunnitteluperustai*/suunnitteluperustei*, kehittämistutkimu*, and design-eksperiment*). The initial search resulted in 625 dissertations. One of the authors and a research assistant screened these results using the following inclusion criteria: (1) at least one of the search terms is visible in the English or Finnish title, abstract, or keywords and (2) the full text is openly available digitally. After applying these criteria, 55 dissertations remained. Each of the authors independently read one-third of this list according to our own interests and expertise. Thereafter, we jointly decided to exclude dissertations that did not utilise EDR as a strategy of inquiry, leaving 49 dissertations. At the beginning of this research, we decided not to use search terms similar to mathematics OR science OR technology AND teach* OR learn* OR class* to locate all dissertations on mathematics, science, or technology education because doing so would not be possible. Instead, we carefully read the remaining EDR dissertations, identified which dissertations concerned mathematics, science, and technology education, and jointly excluded dissertations in fields other than mathematics, science, and technology education, such as other taught subjects (e.g., language, design, and nursing), skill and competence development, teaching and learning support, and learning environments in general.
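The two inclusion criteria and the truncated search terms (a trailing * standing for any word ending) can be illustrated with a short sketch. The Python below is only an assumed approximation of such matching and does not reproduce the search engines of the university repositories; the function names, the record fields (title, abstract, keywords, full_text_open), and the sample record are hypothetical.

```python
import re

# Search terms as listed in the text; "*" marks truncation (any word ending).
SEARCH_TERMS = [
    "design research", "design-based research", "design based research",
    "development research", "developmental research", "design experiments",
    "design-tutkimu*", "suunnittelututkimu*", "design-perustai*",
    "designperustei*", "suunnitteluperustai*", "suunnitteluperustei*",
    "kehittämistutkimu*", "design-eksperiment*",
]


def term_to_regex(term):
    """Build a case-insensitive pattern; '*' expands to any run of word characters."""
    escaped = re.escape(term).replace(r"\*", r"\w*")
    # Lookarounds keep the match on word boundaries (e.g., no hit inside "redesign").
    return re.compile(r"(?<!\w)" + escaped + r"(?!\w)", re.IGNORECASE)


PATTERNS = {term: term_to_regex(term) for term in SEARCH_TERMS}


def matching_terms(text):
    """Return the search terms that occur in the given text."""
    return [term for term, pattern in PATTERNS.items() if pattern.search(text)]


def passes_screening(record):
    """Criterion 1: a term appears in title, abstract, or keywords.
    Criterion 2: the full text is openly available (hypothetical boolean field)."""
    searchable = " ".join(record.get(f, "") for f in ("title", "abstract", "keywords"))
    return bool(matching_terms(searchable)) and bool(record.get("full_text_open", False))


# Hypothetical record, used only to show the call.
example = {"title": "Kemian opetuksen kehittämistutkimus", "full_text_open": True}
print(passes_screening(example))
```

Screening by subject area is deliberately left out of the sketch, since the text explains that this step was done by reading the dissertations rather than by search terms.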

Dataset
After the final screening process, the full texts of 21 EDR dissertations (10 in English and 11 in Finnish; 18 monographs and 3 article-based dissertations) on mathematics, science, and technology education from three universities (the University of Helsinki, University of Jyväskylä, and University of Oulu; n = 14, 6, and 1, respectively) remained for statistical and content analysis. Table 2  Among the EDR dissertations on mathematics, science, and technology education, those of Aksela (2005) and Juuti (2005) were the first two published at the University of Helsinki, that of Leppäaho (2007) was the first at the University of Jyväskylä, and that of Oikarinen (2016) was the only one published at the University of Oulu. Altogether, there were 19 supervisors for the 21 dissertations. Aksela, who completed her EDR dissertation in 2005, supervised nine dissertations (43%), while Lavonen supervised six dissertations (29%).

Data analysis
After the final screening, each author coded one-third of the dissertations using a jointly constructed coding table. The coding categories were initially based on the previous literature, but we regularly discussed and modified existing categories and added relevant categories during the coding to best answer our research questions.
We coded the dissertations according to the following categories: (1) use of EDR terms and theoretical frameworks, (2) research contexts (i.e., educational sectors, settings, and domains), (3) educational problems in practice and research outcomes, (4) research methodology (i.e., research methods, data collection methods, and data sources), (5) scale, collaboration, and researcher's roles, (6) EDR process (i.e., phases of EDR, iterations, alternative design interventions, and issues during development of the intervention), and (7) EDR challenges. After the coding, we analysed the coded data quantitatively and qualitatively. Our findings are presented according to these seven categories in tables, figures, and descriptive analyses in the following section.
Throughout the study, we strove to enhance its validity and reliability by following a precise research process, making joint decisions, cross-checking our data and analysis, consulting the literature for interpretations of the data, and comparing our research results to previous studies.

Use of EDR terms and theoretical frameworks
EDR is referred to by a variety of names, and different scholars define it as having different goals, characteristics, and processes. Thus, investigating how EDR terms and theoretical frameworks have been used in dissertations published during the last two decades improves the understanding of how EDR is utilised and developed.
Of the four terms in each language used for our dissertation search, only three (design research, design-based/design based research, and development/developmental research in English and design-tutkimu*/suunnittelututkimu*, design-perustai*/designperustei*/suunnitteluperustai*/suunnitteluperustei*, and kehittämistutkimu* in Finnish) appeared in the titles, abstracts, or keywords of the 21 dissertations. The dissertations did not apply a uniform format: while all of the dissertations included English versions of the title and abstract, only 18 included Finnish versions. We counted the appearance of each term only once per dissertation. Vartiainen (2016) used two terms in her English abstract, and Hassinen (2006) used two terms in her Finnish abstract. We included both terms in each case, so the totals exceed the number of dissertations (English: n = 22; Finnish: n = 19). Figure 1 shows the frequency of each search term in the English or Finnish titles, abstracts, or keywords of the dissertations. The most commonly used English term was 'design research', which appeared in 13 dissertations (59%), followed by 'design-based/design based research' in 7 dissertations (32%). The most commonly used Finnish term was 'kehittämistutkimu*' (development/developmental research), which appeared in 16 dissertations (84%). Interestingly, for 13 of the 18 dissertations (72%) that provided the title and abstract in both languages, the English and Finnish terms were not consistent. These dissertations used 'kehittämistutkimu*' (development/developmental research) in their Finnish titles or abstracts but either 'design research' or 'design-based/design based research' in their English titles or abstracts.
Our search terms appeared in the titles of 12 dissertations (57% of the 21 dissertations). Of these, six (50%) included the search terms in their primary titles, such as "Design-Based Research of a Meaningful Nonformal Chemistry Learning Environment in Cooperation with Specialists in the Industry" (Ikävalko, 2017) and "A Design Research: Problem and Inquiry Based Higher Education of Chemistry" (Rautiainen, 2012).
The comprehensiveness with which EDR theoretical frameworks were presented in the methodology sections of the dissertations varied from relatively superficial to exceedingly thorough. To investigate the use of these theoretical frameworks, we focused on the main EDR literature cited in the dissertations' methodology sections, such as those regarding the principles, key characteristics, and processes of EDR. We found that early EDR works (e.g., Brown, 1992; Edelson, 2002; DBRC, 2003) and recent works (e.g., Anderson & Shattuck, 2012; McKenney & Reeves, 2019) were used as the main theoretical frameworks. The most cited article was that of Edelson (2002), which described the three types of theories (i.e., domain theories, design frameworks, and design methodologies) that can guide EDR. This article was cited in 18 dissertations (86%). The next most cited article was that of the DBRC (2003), which identified five characteristics of good design-based research and provided recommendations on how to increase the reliability and validity of EDR. This article was cited in 10 dissertations (48%). Of the Finnish EDR literature, Juuti and Lavonen's (2006) article concerning the three pragmatic features of EDR was cited by nine dissertations (43%).
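The percentages reported in this review are simple shares of a stated total, rounded to whole numbers. As a minimal sketch of that arithmetic (the helper name share is our own, and rounding to the nearest whole percent is an assumption that happens to reproduce the reported figures), the citation shares above can be recomputed as follows:

```python
def share(count, total):
    """Percentage of `total` represented by `count`, rounded to a whole number."""
    return round(100 * count / total)


TOTAL_DISSERTATIONS = 21

# Counts taken from the text: number of dissertations citing each work.
cited_in = {
    "Edelson (2002)": 18,
    "DBRC (2003)": 10,
    "Juuti and Lavonen (2006)": 9,
}

for work, count in cited_in.items():
    print(f"{work}: {count}/{TOTAL_DISSERTATIONS} = {share(count, TOTAL_DISSERTATIONS)}%")
```

Where a dissertation is counted in more than one category, the denominator is the number of coded instances rather than the 21 dissertations; for example, share(11, 28) gives the 39% reported for basic education among the 28 coded educational sectors.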

Research contexts
To obtain an overview of the authentic educational contexts in which the EDR dissertations were conducted, we examined their research contexts, including the educational sector (i.e., educational levels based on the Finnish educational system), setting (i.e., formal education vs. nonformal education), and domain (i.e., teaching and learning subjects).

Figure 2. Frequency of educational sectors examined by the dissertations (n = 28)
Note: Five dissertations were carried out in more than one educational sector.
All of the dissertations were conducted in real-world educational contexts, and five were carried out in more than one educational sector. We included all of these sectors in our data (n = 28). Figure 2 shows a pie chart of the various educational sectors examined by the dissertations. Basic education (Grades 1-9; n = 11, 39%) was the most studied educational sector in the dissertations, while pre-primary school (n = 2, 7%) was the least. The majority of the 21 dissertations (n = 14, 67%) were conducted in a formal educational setting leading to formal qualifications, while the others were conducted in either a nonformal setting (n = 3, 14%) or in both types of settings (n = 4, 19%). The research interventions were conducted in various educational domains. Some researchers described these domains in a general way (e.g., science, mathematics, or technology in education), while others referred to specific subjects (e.g., chemistry and physics). We categorised our data accordingly. Moreover, we classified upper secondary school statistics under mathematics, in line with the Finnish national core curriculum. Figure 3 illustrates that the most common domain was chemistry (n = 9, 43%), followed by science in general (n = 4, 19%) and mathematics (n = 4, 19%). In sum, the dissertations were conducted in various research contexts (i.e., educational sectors, settings, and domains). Table 3 illustrates the differences in the research contexts using three dissertations as examples.

Educational problems in practice and research outcomes
All dissertations took at least one of four types of practical educational problems as a point of departure. Two dissertations took two types of problem as a point of departure; thus, we also included them in our data (n = 23). Figure 4 shows that the most common problem (n = 11, 48%) was students' lack of motivation and interest (e.g., Vartiainen, 2016), low performance (Hassinen, 2006), or deficient understanding (e.g., Oikarinen, 2016). The second most common problem (n = 7, 30%) was a lack of teaching and learning materials (e.g., Hongisto, 2012) or challenges in adapting to a new teaching and learning environment (e.g., Nieminen, 2008). The third type of problem (n = 3, 13%) was teachers' deficient understanding and pedagogical skills (e.g., Juntunen, 2015). The last type (n = 2, 9%) concerned changes in a new curriculum (e.g., Kallunki, 2009).
With regard to the practical contributions of the dissertations, various educational interventions were developed to respond to educational challenges in practice. Kallunki (2009) developed a teaching model and a learning environment, and we included both in our data (n = 22). The most common type of intervention involved teaching and learning environments (n = 10, 45%), such as a virtual science club

Research methodology
The way in which EDR projects are conducted plays an important role in the success and reliability of those projects. Research triangulation is highly recommended to ensure the quality of EDR. Therefore, we examined how the triangulation of research methods, data collection methods, and data sources was implemented in the dissertations. We coded the research methods as qualitative, quantitative, and mixed methods (see e.g., Creswell & Creswell, 2018). Fourteen dissertations (67%) gathered and analysed data with mixed methods (i.e., both qualitative and quantitative methods), while the remainder (n = 7, 33%) used only qualitative methods. None were conducted with only quantitative methods. Nevertheless, some of those dissertations that adopted mixed methods did not utilise qualitative and quantitative methods equally. For example, Ratinen's (2016) dissertation consisted of three substudies, only the first of which adopted mixed methods (i.e., a qualitative and quantitative questionnaire).
The dissertations used various methods to collect empirical data. The most common data collection methods were observation and questionnaires (each of which was used by 15 dissertations), followed by written documents, such as essays, diaries, and reports (which were used by 14 dissertations), and then interviews and group interviews (used by 13 dissertations). Some dissertations used tests and exams (e.g., Nieminen, 2008), tasks and exercises (e.g., Juntunen, 2015), and design intervention analysis (e.g., Pernaa, 2011). With regard to data sources, approximately half (11 of 21) of the dissertations collected data from both students and teachers, while the other half (n = 10) collected data from only students or only teachers. Additionally, several dissertations collected data from sources other than students and teachers; for example, Ikävalko (2017) collected data from company specialists, and Vartiainen (2016) collected data from parents.
In addition to investigating the dissertations' data collection methods and sources, we investigated how they collected data with multiple methods and from multiple sources to enhance their research triangulation. The number of data collection methods used in each dissertation ranged from one (Hongisto, 2012) to seven (Juuti, 2005), and the majority used three (n = 7, 33%) or four (n = 5, 24%). The number of data sources used in each dissertation varied from one (e.g., Rukajärvi-Saarela, 2015) to five (Tuomisto, 2018). Most of the researchers collected their data from one (n = 7, 33%) or two sources (n = 10, 48%). We further analysed the research triangulation by using a matrix with two dimensions: the number of data collection methods used in each dissertation on the x-axis and the number of data sources used in each dissertation on the y-axis. As Figure 6 shows, the matrix is composed of four quadrants: (1) low diversity of methods and low diversity of sources (lower left quadrant), (2) high diversity of methods and low diversity of sources (lower right quadrant), (3) low diversity of methods and high diversity of sources (upper left quadrant), and (4) high diversity of methods and high diversity of sources (upper right quadrant). The majority of dissertations are located in the lower quadrants; nine dissertations (43%) had low diversity of methods and low diversity of sources, and eight (38%) had high diversity of methods and low diversity of sources. Only two dissertations (Loukomies, 2013; Vartiainen, 2016) had high diversity of methods and high diversity of sources. Table 4 provides four examples of dissertations from the far corner of each quadrant.
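The quadrant assignment described above can be illustrated with a minimal sketch. The cut-offs are an assumption: the text does not state where "low" ends and "high" begins, so the values below (up to three methods and up to two sources count as low) were merely chosen to be compatible with the counts reported above. The dissertation labels and their (methods, sources) pairs are hypothetical and are not taken from the reviewed data.

```python
from collections import Counter

# Assumed cut-offs (not stated in the text): values up to the cut-off
# count as "low" diversity, values above it as "high" diversity.
METHOD_CUT = 3
SOURCE_CUT = 2

QUADRANT_LABELS = {
    1: "low methods, low sources (lower left)",
    2: "high methods, low sources (lower right)",
    3: "low methods, high sources (upper left)",
    4: "high methods, high sources (upper right)",
}


def quadrant(n_methods: int, n_sources: int) -> int:
    """Return the quadrant number (1-4) used in the text's matrix."""
    high_methods = n_methods > METHOD_CUT
    high_sources = n_sources > SOURCE_CUT
    if not high_methods and not high_sources:
        return 1
    if high_methods and not high_sources:
        return 2
    if not high_methods and high_sources:
        return 3
    return 4


# Hypothetical (methods, sources) pairs, for illustration only.
examples = {"D1": (1, 1), "D2": (7, 2), "D3": (2, 5), "D4": (6, 4), "D5": (3, 2)}

counts = Counter(quadrant(m, s) for m, s in examples.values())
for q in sorted(counts):
    print(f"Quadrant {q} ({QUADRANT_LABELS[q]}): {counts[q]}")
```

Changing `METHOD_CUT` or `SOURCE_CUT` moves dissertations near the boundary between quadrants, so any reading of the matrix depends on the thresholds chosen.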

Scale, collaboration, and researcher's roles
The scale of the dissertations varied widely in terms of the size of the research team (from an individual researcher to a large multidisciplinary team), the number of research participants (from 15 to over 1000 participants), and the time taken to complete the dissertation (from 3 to 14 years). Eight researchers (38%) conducted their dissertations alone, while the remaining 13 (62%) collaborated with other researchers or disciplines. For example, Ratinen (2016) conducted his dissertation in collaboration with another researcher, and Nousiainen (2008) worked in a multidisciplinary team composed of members from various fields, including educational sciences, natural sciences, mathematical information technology, and game design (e.g., multimedia and graphic design), as well as stakeholders (e.g., industry representatives, biology and geography teachers, and students from several school levels).
Twenty researchers had at least one additional role besides that of a researcher. The majority (n = 13, 62%) of the researchers (e.g., Leppäaho, 2007) had three roles: a researcher who plans the research, collects data, and analyses data; a developer who designs and develops a design intervention; and a teacher who teaches in the research intervention. Seven researchers (33%), including Ekonoja (2014), had two roles: a researcher and a developer. Juntunen (2015) was the only one who had a single role: a researcher.

EDR process
To investigate the EDR processes used by the dissertations, we analysed the phases of EDR, iterations, alternative design interventions, and issues that were considered during the intervention development.
To analyse the EDR phases of the dissertations, we coded the progress of EDR according to three main phases: (1) preliminary research, (2) development phase, and (3) assessment phase (see Plomp, 2013). Although the EDR processes of the dissertations were presented in various ways using various terms (e.g., cases, cycles, phases, stages, and substudies), we found that all dissertations progressed through the three main phases. However, the first phase (i.e., investigation of problems, needs, and context) was not fully conducted in several dissertations. For example, Hassinen (2006) did not empirically investigate needs or context and only reviewed the literature on school algebra, curricula, related theories, and textbooks, and Ekonoja's (2014) first phase was conducted as part of his master's thesis. Additionally, while the preliminary research and assessment phases were reported thoroughly in all dissertations, the development phase was reported rather briefly in some (e.g., Oikarinen, 2016) and comprehensively in others (e.g., Juuti, 2005).
As an important characteristic of EDR is its iterative process of design, assessment, and redesign, we investigated the dissertations' iterations by examining revisions of the interventions and the number of subcycles implemented throughout each dissertation (see McKenney & Reeves, 2019). Almost all researchers (n = 20, 95%) revised their interventions during their dissertations. Seven also refined their interventions after their final field trials. With regard to the number of subcycles, 19 researchers (90%) revised their interventions through multiple subcycles: thirteen (62%) employed two subcycles, four (19%) employed three, one (5%) employed four, and one (5%) employed seven. In addition to performing seven subcycles, Rukajärvi-Saarela (2015) refined her pre- and in-service teacher course after the final field trial. In contrast, two dissertations (10%) performed only one subcycle. After that subcycle, Hassinen (2006) did not revise her Idea-based Algebra teaching model, while Leppäaho (2007) developed his problem-solving materials further in a textbook.
To examine how the researchers ensured that their interventions contributed to real-world settings, we also investigated whether any dissertations worked with alternative designs or considered issues besides pedagogy when developing the interventions. Only Nousiainen (2008) worked with alternative designs: the first project included alternative user interfaces with different layouts and interaction styles, and the second project generated initial ideas and then integrated and developed them in greater detail.
With regard to the issues considered during intervention development, we found that besides pedagogical issues, most of the dissertations considered the needs of policymakers, particularly the National Core Curriculum, when developing interventions. Only a few dissertations considered other issues, such as practicality, usability, administration, and organisation. For example, when developing her ICT learning environment, Aksela (2005) considered pedagogy, the needs of policymakers, practicality (e.g., time, ease of use, resource availability, and classroom space), usability, and technical issues.

EDR challenges
Finally, we investigated which EDR challenges were encountered during the dissertations. The challenges in the dissertations can be classified into five categories, which are described below.
First, it was difficult to generalise the results due to the small number of research participants, the short length of interventions, the small number of iterative cycles, the insufficiency of relying only on qualitative data, or context-bound research results (e.g., Ekonoja, 2014; Kallunki, 2009). Second, the nature of EDR made it challenging to perform the research for the dissertations. For example, in Nousiainen's (2008) dissertation, it was difficult to compare the research results from different phases, and it was difficult for some participants to recall what happened at the beginning of a long intervention. In the case of Ekonoja (2014), the EDR interventions were typically innovative in nature, and thus there were no previous studies related to his research. Moreover, his intervention relied greatly on technology. Third, the researchers had limited resources in relation to the complexity of EDR, which requires a huge amount of work due to the need to gather and analyse a large dataset (Vartiainen, 2016) and explicitly document the whole process (Pernaa, 2011). Fourth, EDR was often conducted with multidisciplinary collaboration, which required mutual understanding and good teamwork (e.g., Ikävalko, 2017). Fifth, when the researchers took on multiple roles, it was sometimes difficult for them to maintain objectivity (e.g., Oikarinen, 2016).

Discussion and conclusions
Our study improves the understanding of how EDR has been utilised and developed and which challenges it has faced over the last two decades by systematically reviewing 21 Finnish doctoral dissertations on mathematics, science, and technology education. The findings indicate that all dissertations made practical and theoretical educational contributions. In line with the literature (e.g., DBRC, 2003; McKenney & Reeves, 2019; Plomp, 2013), all of the dissertations exhibited the characteristics of EDR, including the use of educational problems in practice as a point of departure, research in real-world settings, evolution through an iterative process (i.e., preliminary research, development, and assessment), development of practical interventions, and refinement of theoretical knowledge. Moreover, the challenges faced by the researchers (e.g., high demand for conducting EDR with limited resources and the difficulties of multidisciplinary teamwork) are generally similar to those stated by other scholars (e.g., Brown, 1992; McKenney & Reeves, 2019). However, the dissertations were distinctly diverse in terms of the research context (i.e., educational sectors, settings, and domains), educational problems in practice, research outcomes, research methodology (i.e., research methods, data collection methods, and data sources), scale, and collaboration. Like the EDR reviews of Anderson and Shattuck (2012) and Zheng (2015), the findings support the plurality of EDR (see Bell, 2004). Our results indicate that it is feasible to conduct EDR dissertations in different educational sectors, in different settings and domains, at various scales, and with different research designs.
Based on our observations, we agree with other researchers (e.g., Easterday et al., 2017; Ørngreen, 2015; Zheng, 2015) that EDR still needs much more work. Thus, we propose several suggestions for future EDR. First, we encourage agreement between the terms used to describe EDR in different languages to promote consistency and avoid confusion. Second, as EDR is an emergent research approach (Easterday et al., 2017), recent literature should be consulted so that researchers can stay up to date. Third, in agreement with the DBRC (2003) and McKenney and Reeves (2019), we believe that the triangulation of research methods, data collection methods, and data sources is needed to better understand complex authentic phenomena and ensure the trustworthiness of EDR. Fourth, we support Kennedy-Clark (2013) and McKenney and Reeves (2019) and highly encourage multidisciplinary collaboration so that EDR researchers benefit from the expertise of others and increase the feasibility and robustness of their research. Fifth, in line with McKenney and Reeves (2019) and Ørngreen (2015), when developing the intervention, working with alternative designs and considering various issues faced by all people in real-world contexts can enhance the success of EDR and ensure that the intervention continues to be utilised in real-world settings. Sixth, we agree with McKenney and Reeves (2019), Kennedy-Clark (2015), and Zheng (2015) that design activities and processes should be further emphasised so that others can benefit from them. Finally, due to the appearance of EDR terms in the primary titles of six dissertations, which implies that there is an overemphasis on EDR at the expense of the subject of the research, and the fact that EDR requires substantial resources (Kelly, 2013), we recommend that EDR should be undertaken because of its appropriateness and utility rather than for its own sake.
Our research has several limitations. First, our systematic review included only 21 Finnish dissertations on mathematics, science, and technology education from five universities. A broader dataset in terms of the number of universities, dissertations, and educational fields would greatly improve the understanding of the utilisation and development of EDR. Second, the large dataset (a total of 4187 pages), the lack of a shared writing structure, and the implicit reporting of information that was necessary for this review made it difficult to perform data coding and analysis. More resources for coding and analysis would increase the precision of the research results and decrease the workload of researchers conducting the review. Last, to gain an overview of the utilisation, development, and challenges of EDR, we adopted a broad perspective when systematically reviewing the use of EDR terms and theoretical frameworks, research contexts, educational problems in practice and research outcomes, research methodologies, the dissertations' scale and collaboration, the researchers' roles, EDR processes, and EDR challenges. While our review indeed provides an overview, a review focusing on specific issues would yield deeper insights into EDR.