The transfer effect of computational thinking (CT)-STEM: a systematic literature review and meta-analysis

Background: Integrating computational thinking (CT) into STEM education has recently drawn significant attention, strengthened by the premise that CT and STEM are mutually reinforcing. Previous CT-STEM studies have examined theoretical interpretations, instructional strategies, and assessment targets. However, few have endeavored to delineate the transfer effects of CT-STEM on the development of cognitive and noncognitive benefits. Given this research gap, we conducted a systematic literature review and meta-analysis to provide deeper insights.
Results: We analyzed results from 37 studies involving 7,832 students with 96 effect sizes. Our key findings include: (i) identification of 36 benefits; (ii) a moderate overall transfer effect, with moderate effects also observed for both near and far transfers; (iii) a stronger effect on cognitive benefits compared to noncognitive benefits, regardless of the transfer type; (iv) significant moderation by educational level, sample size, instructional strategies, and intervention duration on overall and near-transfer effects, with only educational level and sample size being significant moderators for far-transfer effects.
Conclusions: This study analyzes the cognitive and noncognitive benefits arising from CT-STEM's transfer effects, providing new insights to foster more effective STEM classroom teaching.


Introduction
In recent years, computational thinking (CT) has emerged as one of the driving forces behind the resurgence of computer science in school curriculums, spanning from pre-school to higher education (Bers et al., 2014; Polat et al., 2021; Tikva & Tambouris, 2021a). CT is complex, with many different definitions (Shute et al., 2017). Wing (2006, p. 33) defines CT as a process that involves solving problems, designing systems, and understanding human behavior by drawing on the concepts fundamental to computer science (CS). Contrary to the common perception that CT belongs solely to CS, it has gradually come to represent a universally applicable attitude and skill set (Tekdal, 2021) involving cross-disciplinary literacy (Ye et al., 2022), which can be applied to solving a wide range of problems within CS and other disciplines (Lai & Wong, 2022). Simply put, CT involves thinking like a computer scientist when solving problems, and it is a universal competence that everyone, not just computer scientists, should acquire (Hsu et al., 2018). Developing CT competency not only helps one acquire domain-specific knowledge but also enhances one's general ability to solve problems across various academic fields (Lu et al., 2022; Wing, 2008; Woo & Falloon, 2022; Xu et al., 2022), including STEM (science, technology, engineering, and mathematics) (Chen et al., 2023a; Lee & Malyn-Smith, 2020; Wang et al., 2022a; Waterman et al., 2020; Weintrop et al., 2016), the social sciences, and liberal arts (Knochel & Patton, 2015).
Given the importance of CT competency, integrating it into STEM education (CT-STEM) has emerged as a trend in recent years (Lee et al., 2020; Li & Anderson, 2020; Merino-Armero et al., 2022). CT-STEM represents the integration of CT practices with STEM learning content or context, grounded in the premise that a reciprocal relationship between STEM content learning and CT can enrich student learning (Cheng et al., 2023). Existing research supports that CT-STEM enhances student learning in two ways (Li et al., 2020b). First, CT, viewed as a set of practices for bridging disciplinary teaching, shifts traditional subject forms towards computational-based STEM content learning (Wiebe et al., 2020). Engaging students in discipline-specific CT practices like modeling and simulation has been shown to improve their content understanding (Grover & Pea, 2013; Hurt et al., 2023) and enhance learning (Aksit & Wiebe, 2020; Rodríguez-Martínez et al., 2019; Yin et al., 2020). Second, CT can be taken as a transdisciplinary thinking process and practice, providing a structured problem-solving framework that can reduce subject fixation (Ng et al., 2023). Aligning with integrated STEM (iSTEM) teaching, this approach equips students with critical skills such as analytical thinking, data manipulation, algorithmic thinking, collaboration, and creative solution development in authentic contexts (Tikva & Tambouris, 2021b). Such skills are increasingly vital for addressing complex problems in a rapidly evolving digital and artificial intelligence-driven world.
To better understand and evaluate CT-STEM transfer effects on students' acquisition of cognitive and noncognitive benefits, we systematically review published CT-STEM effects using PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines (Moher et al., 2010). We employ meta-analysis to quantify these effects and identify moderating variables. The following research questions guide our study:
RQ1: What cognitive and noncognitive benefits are acquired from CT-STEM's near and far transfer effects?
RQ2: (a) What are the overall transfer effects of CT-STEM on the cognitive and noncognitive benefits mentioned in RQ1? and (b) What are the moderators of this effect?
RQ3: (a) What are the near and far transfer effects of CT-STEM on the cognitive and noncognitive benefits mentioned in RQ1? and (b) What are the moderators of these effects?

Computational thinking (CT)
The concept of procedural thinking was first introduced by Papert (1980), who connected programming to procedural thinking and laid a foundation for CT (Merino-Armero et al., 2022). Although Papert was the first to describe CT, Wing (2006, 2008, 2011) brought considerable attention back to the term, a focus that continues to date (Brennan & Resnick, 2012; Chen et al., 2023a). Various other definitions have emerged in the literature, and there is no consensus definition of CT (Barr & Stephenson, 2011; Grover & Pea, 2013; Shute et al., 2017). The definitions of CT often incorporate programming and computing concepts (e.g., Israel-Fishelson & Hershkovitz, 2022) or consider CT to be a set of elements associated with both computing concepts and problem-solving skills (e.g., Kalelioglu et al., 2016; Piatti et al., 2022). From the former perspective, many researchers defined CT based on programming and computing concepts. For example, Denner et al. (2012) defined CT as a united competence composed of three key dimensions: programming, documenting and understanding software, and designing for usability. An alternative defining framework (Brennan & Resnick, 2012), originating from a programming context (i.e., Scratch), focuses on CT concepts and practices, including computational terms of sequences, loops, conditionals, debugging, and reusing.
Viewed from the latter perspective, CT deviates from the competencies typically associated with simple computing or programming activities. Instead, it is characterized as a set of competencies encompassing domain-specific knowledge/skills in programming and problem-solving skills for non-programming scenarios (Lai & Ellefson, 2023; Li et al., 2020a; Tsai et al., 2021, 2022). Using this broad viewpoint, CT can be defined as a universally applicable skill set involved in problem-solving processes. For instance, ISTE and CSTA (2011) developed an operational definition of CT, which refers to a problem-solving process covering core skills, such as abstraction, problem reformulation, data practices, algorithmic thinking, automation & modeling & simulation, and generalization. Selby and Woollard (2013) proposed a process-oriented definition of CT based on its five essential practices: abstraction, decomposition, algorithmic thinking, evaluation, and generalization. Shute et al. (2017) provided a cross-disciplinary definition centered on solving problems effectively and efficiently, categorizing CT into six practices: decomposition, abstraction, algorithm design, debugging, iteration, and generalization. In all these cases, the essence of CT lies in a computer scientist's approach to problems, which is a skill applicable to everyone's daily life and across all learning domains.
The above classification of definitions mainly focuses on the cognitive aspect of CT. Other researchers have suggested that CT contains not only a cognitive component (Román-González et al., 2017) but also a noncognitive component, highlighting important dispositions and attitudes, including confidence in dealing with complexity, persistence in working with difficult problems, tolerance for ambiguity, the ability to deal with open-ended problems, and the ability to communicate and work with others to achieve a common goal or solution (Barr & Stephenson, 2011; CSTA & ISTE, 2011).
In short, while computational thinking (CT) is frequently associated with programming, its scope has significantly expanded over the years (Hurt et al., 2023; Kafai & Proctor, 2022). Building on these prior efforts, we define CT as a problem-solving/thought process that involves selecting and applying the appropriate tools and practices for solving problems effectively and efficiently. As a multifaceted set of skills and attitudes, CT includes both cognitive aspects, highlighting students' interdisciplinary practices/skills, and noncognitive aspects like communication and collaboration.

Integrating CT in STEM education (CT-STEM)
There is an urgent need to bring CT into disciplinary classrooms to prepare students for new integrated fields (e.g., computational biology and computational physics) as practiced in the professional world. To address this, a growing body of research and practice has focused on integrating CT into specific and iSTEM lessons (Jocius et al., 2021). This integration, i.e., CT-STEM, refers to the infusion of CT practices with STEM content/context, with the aim of enhancing students' CT skills and STEM knowledge (Cheng et al., 2023). Accordingly, CT-STEM serves a dual purpose: first, it has the potential to foster the development of student CT practices and skills; second, it simultaneously deepens students' disciplinary understanding and improves learning performance within and across disciplines (Waterman et al., 2020). Current research reveals two potential ways this integration facilitates students' STEM learning. First, integrating CT into STEM provides students with an essential, structured framework by characterizing CT as a thought process and general competency, with disciplinary classrooms offering "a meaningful context (and set of problems) within which CT can be applied" (Weintrop et al., 2016, p. 128). Key processes of this problem-solving approach include: formulating problems computationally, data processing for solving problems, automating/simulating/modeling solutions, evaluating solutions, and generalizing solutions (Lyon & Magana, 2021; Wang et al., 2022a). Engaging in these practices aids students in applying STEM content to complex problem-solving and develops their potential as future scientists and innovators, aligning with iSTEM teaching.
In addition, introducing CT within disciplinary classroom instruction transforms traditional STEM subject formats into an integrated computational-based approach. This approach draws on a specific set of CT practices that integrate naturally into different STEM disciplines to facilitate students' content learning (Li et al., 2020b; Weller et al., 2022). Weintrop et al. (2016) identified four categories of CT practices in math and science education: data practices, modeling and simulation practices, computational problem-solving practices, and systems thinking practices. Engaging students in systems thinking practices can simplify the understanding of systems and phenomena within the STEM disciplines (Grover & Pea, 2013). Integrating CT involves students in data practices, modeling, simulation, and/or using computational tools such as programming to generate representations, rules, and reasoning structures (Phillips et al., 2023). This aids in formulating predictions and explanations, visualizing systems, testing hypotheses, and enhancing students' understanding of scientific phenomena and mechanisms (Eidin et al., 2024). Comparing the two integration approaches described above, the first places specific attention on developing discipline-general CT, while the second emphasizes improving students' learning of disciplinary content and developing discipline-specific CT (Li et al., 2020b).
Practical aspects of CT-STEM have also been explored in the literature, including instructional strategies and assessment targets. Scholars have attempted different instructional strategies for CT-STEM implementation to achieve the designated educational purpose. These strategies can be categorized as instructional models (e.g., problem-driven strategies and project-based strategies), topic contexts (e.g., game-based strategies, and modeling- and simulation-based strategies), scaffolding strategies, and collaborative strategies (Wang et al., 2022a) (see Table 1). Typically, in instructional models, CT is viewed as an essential competency, guiding students to create interdisciplinary artifacts and solve specific real-world problems. Li et al. (2023) integrated CT as a core thought model into a project-based learning process, focusing on student-designed products for practical problems. Compatible with instructional models, a variety of instructional strategies based on topic contexts have been used, such as game design, computational modeling and simulation, and robotics. These, also called plugged-in activities, typically involve computer programming for performing STEM tasks (Adanır et al., 2024). In contrast, unplugged activities operate independently of computers, involving physical movements or using certain objects to illustrate abstract STEM concepts or principles (Barth-Cohen et al., 2019; Chen et al., 2023b). In combination with the above strategies, scaffolding strategies, such as guidance and adaptive, peer-, and resource-scaffolding, have been designed and utilized in CT-STEM to reduce students' cognitive load and provide support for their self-regulated learning. In addition, educators have employed various collaborative strategies (e.g., Think-Pair-Share practice) to enhance students' cooperative and communicative skills in CT-STEM learning (Tikva & Tambouris, 2021a). In short, the type of instructional strategy used is a significant factor influencing the effectiveness of CT-STEM.
Prior research has focused on assessment targets within the cognitive and noncognitive domains (Tang et al., 2020; Wang et al., 2022a). The former includes direct cognitive manifestations such as knowledge and skills related to CT constructs and STEM constructs, as well as domain-general mental abilities such as creativity and critical thinking (Tang et al., 2020). Wang et al. (2022a) reported that CT-STEM studies targeted cognitive domain assessments, which included assessments of students' CT concepts and skills, programming knowledge and skills, and STEM achievements. These constructs were mainly measured through tests, including validated and self-developed tests. Other researchers characterize CT as a general thinking skill and employ performance scales for measurement (e.g., Korkmaz et al., 2017; Tsai et al., 2019, 2021). The assessment of the noncognitive domain focused on students' dispositions and attitudes towards CT-STEM (Lai & Wong, 2022), including self-efficacy, interest, and cooperativity, mainly measured by surveys/scales.
In summary, CT-STEM has garnered considerable attention from researchers, primarily exploring theoretical interpretations of how a reciprocal relationship between STEM and CT can enrich student learning. CT-STEM is implemented through the development and application of varied instructional strategies, with assessments aimed at understanding its effects on students' cognitive and noncognitive domains. While these are important contributions, there is a notable lack of systematic and empirical evidence concerning the differentiated benefits of CT-STEM integration. We aim to address this deficit by differentiating benefits via transfer effects and systematically synthesizing pertinent research in this field.

Transfer effect of learning
Transference or transfer effect refers to the ability to apply what one has known or learned in one situation to another (Singley & Anderson, 1989), standing at the heart of education as it highlights the flexible application of acquired knowledge (OECD, 2018). Perkins and Salomon (1992) defined transfer as the process of transferring learning and performance from one context to another, possibly even in a dissimilar context. From a cognitivist perspective, knowledge, seen as a stable mental entity, can traditionally be summoned and adapted to new situations under the right circumstances (Day & Goldstone, 2012). Nevertheless, this traditional approach has been subject to extensive criticism, particularly from those who hold a constructivist perspective. From their view, the transfer of learning is not a static application of knowledge to a new context but rather the "byproduct of participation in particular situations" (Day & Goldstone, 2012), a standpoint widely acknowledged and endorsed by most researchers. Despite the broad consensus on this view (Scherer et al., 2019), some questions remain: How can a successful transfer occur? What factors define "other" or "new" contexts? One prominent explanation for the successful transfer of knowledge is the theory of "common elements" (Singley & Anderson, 1989), which hypothesizes that successful transfer depends upon the elements that two different contexts or problem situations share (Scherer et al., 2019). Thus, based on this theory, the transfer effect can be divided into near transfer and far transfer (Perkins & Salomon, 1992). Near transfer occurs when successful skills and strategies are transferred between contexts that are similar, i.e., contexts that are closely related and require similar skills and strategies to be performed; conversely, far transfer occurs when successful skills or strategies are transferred between contexts that are inherently different (Perkins & Salomon, 1992). Essentially, the transfer effect is determined by the similarity or overlap between the contexts and problems in which the skills were acquired and the new, different problems that are encountered in the future (Baldwin & Ford, 1988). Simply put, there is a greater chance of transference between related contexts or problem situations (near-transfer) than between divergent situations (far-transfer). Since transfer effects are inherently situation-specific, they depend highly on the circumstances under which the skills/knowledge were acquired and the overlap with the new situation (Lobato, 2006).
While far-transfer effects are less likely to occur, numerous studies have reported far-transfer effects, albeit to varying extents (Bransford & Schwartz, 1999). Scherer et al. (2019) reported a moderate effect (g = 0.47) indicative of far transfer effects in learning computer programming, while Sala and Gobet (2016) found relatively limited evidence of far transfer effects within the domains of chess instruction and music education: successful transfer was only observed in situations that required skills similar to those acquired in the interventions. The extent of far-transfer can fluctuate across different contexts, indicating a need for further exploration within different disciplines and learning contexts.

The transfer effects of CT-STEM
The transfer effects of learning computer programming have been explored (Bernardo & Morris, 1994; Pirolli & Recker, 1994; Scherer et al., 2019, 2020). For instance, a study of students learning BASIC programming found that acquiring programming knowledge significantly enhanced the students' abilities to solve verbal and mathematical problems; however, no significant differences were found in mathematical modeling and procedural comprehension (Bernardo & Morris, 1994). Scherer et al. (2019) conducted a meta-analysis exploring the effects of transferring computer programming knowledge on students' cognitive benefits. They identified positive skill transfers from learning programming to areas such as creative thinking, mathematical abilities, and spatial skills. Beyond cognitive benefits, Popat and Starkey (2019) and Melro et al. (2023) indicate that learning programming also contributes to noncognitive benefits like collaboration and communication.
Programming can be a conduit for teaching, learning, and assessing CT and a mechanism to expose students to CT by creating computational artifacts. Although programming skills and CT share a close relationship and overlap in several aspects (e.g., application of algorithms, abstraction, and automation), they are not identical (Ezeamuzie & Leung, 2022): CT also involves incorporating computational perspectives and computational participation (i.e., students' understanding of themselves and their interactions with others and technology; Shute et al., 2017). CT can also be taught without programming through so-called unplugged activities. Hence, research on the transfer of programming only addresses a limited aspect of CT transference.
Research on CT transfer effects has recently surged (Liu & Jeong, 2022; Ye et al., 2022). In a meta-analysis, Ye et al. (2022) reported a positive transfer effect beyond computer programming in understanding science, engineering, mathematics, and the humanities. Using in-game CT supports, Liu and Jeong (2022) reported a significant improvement in student CT skills at the near transfer level but not at the far transfer level. Correlation analyses by Román-González et al. (2017) demonstrated a significant relationship between CT and other cognitive abilities, which is corroborated by Xu et al.'s (2022) study, showing that CT relates to numerous cognitive and learning abilities in other domains, such as reasoning, creative thinking, and arithmetic fluency. Other studies attribute cognitive benefits to CT, such as executive functions (Arfé et al., 2019). Although the results from correlation analyses cannot provide definitive causal evidence, they offer valuable insights and directions for future investigations, including potential meta-analysis studies.
While several systematic reviews and meta-analyses have been conducted on programming and CT transfer effects, there is a scarcity of meta-analyses that investigate the transfer effects of CT-STEM and the variables that moderate these effects. Cheng et al. (2023) explored the overall effect of CT-STEM on students' STEM learning performance within a K-12 education context and reported a large effect size (g = 0.85) between pretest and posttest scores on STEM learning outcomes. They investigated moderating variables in the models, including student grade levels, STEM disciplines, intervention durations, and types of interventions. Of these, only the intervention durations had a significant moderating effect. While their work offers evidence supporting the effectiveness of CT-STEM on students' learning outcomes, evidenced by a large effect size, we identified three notable shortcomings. First, their meta-analysis lacked a focus on potential benefits that can be derived from CT-STEM integration, particularly in terms of differentiating learning outcomes from the perspective of transfer effects. Existing meta-analyses have found that effect sizes vary considerably across various types of learning outcomes (Sala & Gobet, 2017; Scherer et al., 2019). This variation indicates that CT-STEM may not benefit different categories of learning outcomes equally. Second, the study focused only on cognitive learning outcomes, omitting noncognitive effects that may be fostered by CT-STEM. As noted earlier, although CT is primarily a cognitive psychological construct associated with cognitive benefits, it also has a complementary noncognitive aspect (Román-González et al., 2018). The synergy between CT and STEM holds promise for delivering cognitive and noncognitive benefits to students. Third, their inclusion of only studies that employed one-group pretest-posttest designs may contribute to biased outcomes, limiting the potential representativeness and robustness of the research findings (Cuijpers et al., 2017). Morris and DeShon (2002) posited that combining effect sizes from different study designs, both rationally and empirically, would lead to more reliable and comprehensive conclusions.
While various studies have validated the transfer effect of programming and CT, a systematic examination of CT-STEM's transfer effects remains an area for further exploration. Our review identified key gaps, including a lack of differentiation in learning outcomes, insufficient focus on noncognitive benefits, and limitations in research robustness. Additionally, practical challenges, such as identifying effective activities and methods for CT integration into STEM, as well as determining optimal intervention durations, need to be addressed. We address these issues by investigating the transfer effects of CT-STEM, combining effect sizes from diverse studies, and considering both cognitive and noncognitive domains. We also identify practical factors that could influence these effects through moderator analysis. Our goal is to enhance instructional design in CT-STEM and provide new insights and guidance for both practitioners and researchers in the field.

Conceptual framework for the present study
Drawing from Mayer's (2011, 2015) framework, we synthesized evidence on the CT-STEM transfer effects and the contextual conditions that enhance instructional effectiveness. This framework, widely used to evaluate technology-based interventions like computer programming and educational robotics (Chen et al., 2018; Sun & Zhou, 2022; Tsai & Tsai, 2018), offers a multifaceted perspective on instructional methods. It allows for the exploration of three types of research questions: (a) Learning consequences, by examining the benefits of specific instructional methods; (b) Media comparison, by assessing the effectiveness of instructional methods; and (c) Value-added teaching, by investigating how changes in teaching conditions affect student performance. Chen et al. (2018) highlight this framework's aptitude for systematically organizing and mapping domains and study contexts, accommodating diverse research foci.
Transferring this framework to the context of CT-STEM instruction (see Fig. 1), we systematically summarize the learning sequences through CT-STEM's transfer effect. Based on our literature review section, we have categorized these sequences into four types: (a) Cognitive benefits through near transfer effect (CNT); (b) Noncognitive benefits through near transfer effect (NCNT); (c) Cognitive benefits through far transfer effect (CFT); and (d) Noncognitive benefits through far transfer effect (NCFT). This study synthesizes evidence on CT-STEM's effectiveness per transfer type and examines various moderators affecting these effects. We considered sample features (e.g., educational level and sample size) and study features (e.g., study design, subject, instructional strategy, and intervention duration) as potential moderators affecting the transferability of CT-STEM. Previous CT-related studies indicated that these moderators contribute to variance in the effect sizes (Lai & Wong, 2022; Scherer et al., 2020; Sun & Zhou, 2022; Ye et al., 2022).

Methodology
We collected and analyzed literature on the transfer effects of CT-STEM using a rigorous systematic review process (Jesson et al., 2011), adhering to the PRISMA guidelines (Moher et al., 2010).

Database and keywords
We initially searched for key works on CT and STEM in seven databases: Web of Science, Science Direct, Springer, Wiley, IEEE Xplore Digital Library, Sage, and Taylor & Francis. In the search, CT was explicitly confined to "computational thinking." The major intervention approaches, such as programming, plugged activities, and unplugged activities, were also included. For STEM, we used the following terms: STEM, science, technology, engineering, and mathematics, and further supplemented "science" with discipline-specific terms like "physics," "chemistry," and "biology." Additionally, we added "game design" and "robotics" to complement "technology," as these are significant technical contexts for CT. As a final step, we searched for full peer-reviewed articles in the databases using keyword groupings, focusing exclusively on educational and educational research fields: ("Computational thinking" OR "programming" OR "plugged activity" OR "unplugged activity") AND ("STEM" OR "technology" OR "engineering" OR "mathematics" OR "physics" OR "chemistry" OR "biology" OR "game design" OR "robotics"). The initial search included articles published between January 1, 2011, and March 1, 2023, as professional CT-STEM fields were formed and gained popularity after 2011 (Lee & Malyn-Smith, 2020; Malyn-Smith & Ippolito, 2011). This initial search yielded 12,358 publications, which were then subjected to further screening.
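The keyword grouping above can be assembled programmatically, which helps keep the string identical across the seven databases. The following is a minimal sketch; the helper name `or_group` and the two list names are hypothetical, and the term lists simply reproduce the query string as printed in the text.

```python
def or_group(terms):
    """Quote each term and join the terms with OR inside parentheses."""
    return "(" + " OR ".join(f'"{t}"' for t in terms) + ")"


# Terms copied from the query string reported in the text.
ct_terms = [
    "Computational thinking",
    "programming",
    "plugged activity",
    "unplugged activity",
]
stem_terms = [
    "STEM",
    "technology",
    "engineering",
    "mathematics",
    "physics",
    "chemistry",
    "biology",
    "game design",
    "robotics",
]

# The CT group and the STEM group are combined with AND.
query = or_group(ct_terms) + " AND " + or_group(stem_terms)
print(query)
```

Individual databases differ in their field tags and date filters, so a string built this way would still need per-database adaptation.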

Inclusion and exclusion criteria
The inclusion and exclusion criteria for articles are detailed in Table 2. This study examined the transfer effects of CT-STEM, exploring both near and far transfer effects on the acquisition of cognitive and noncognitive benefits. Eligible studies included those with experimental or quasi-experimental designs, such as Independent-groups pretest-posttest (IGPP), Independent-groups posttest (IGP), and Single-group pretest-posttest (SGPP), reporting pretest and posttest or solely posttest performance. Articles in which CT was not integrated with STEM content or context, or whose authors did not conceptualize or assert their studies as integrating CT with STEM learning, were excluded. Studies focusing on programming tools like Scratch or robotics without involving other STEM content or contexts were also excluded. Since STEM education often emphasizes situated learning, with contexts from social studies, culture, language, and arts (Kelley & Knowles, 2016), articles in other disciplines (e.g., social sciences, literacy, and culture) that involve CT activities, such as designing digital stories and games (Zha et al., 2021), were included. We did not limit the educational context (e.g., K-12 or higher education) since

Study selection
Figure 2 shows the three selection stages: identification, screening, and eligibility evaluation. After the initial search, automatic and manual searching were used to eliminate duplicates. Two independent researchers used the inclusion and exclusion criteria to screen the article titles and abstracts, eliminating those that did not fit the criteria. Following this, the texts of the remaining articles were scrutinized and assessed using the criteria requirements for inclusion in the final sample. The interrater agreement was high (Cohen's Kappa coefficient = 0.92). All disagreements were resolved by discussing and reviewing. This selection process yielded 32 studies that met the eligibility criteria. Lastly, a "snowball" search method (Petersen & Valdez, 2005) was used to find additional articles that met the criteria. Both backward and forward snowballing using the identified papers resulted in an additional five papers. Overall, the search and evaluation process yielded 37 articles for analysis (a complete list of references for these included studies can be found in Supplementary Material A1).
Fig. 2 The selection flowchart used based on PRISMA approach
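For readers unfamiliar with the agreement statistic reported above, Cohen's kappa compares the observed agreement between two raters with the agreement expected by chance from each rater's label frequencies. The sketch below is illustrative only: the function name and the four screening decisions are hypothetical and are not the study's data.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labeled the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must label the same non-empty set of items.")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal label proportions.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    labels = freq_a.keys() | freq_b.keys()
    expected = sum(freq_a[c] * freq_b[c] for c in labels) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical screening decisions for four abstracts.
a = ["include", "include", "exclude", "exclude"]
b = ["include", "exclude", "exclude", "exclude"]
print(cohens_kappa(a, b))
```

In this toy case the raters agree on three of four items (0.75) against a chance agreement of 0.5, so kappa is 0.5, well below the 0.92 reported in the text.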

Coding of studies
We modified the systematic review coding scheme spreadsheet (Scherer et al., 2019; Ye et al., 2022), which was used to document and extract information. It includes basic study details (reference, publication year, and journal), four types of outcome variables, sample features (educational level and sample size), study characteristics (study design, subject, instructional strategy, and intervention duration), and statistical data for effect size calculation. To ensure the reliability of the coding, each study was coded by two researchers using the coding scheme. The interrater reliability was 0.93 using the Kappa coefficient, and discrepancies were settled in discussion sessions until mutual agreement was reached.
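This section does not state which formula was applied to the extracted statistics. As an illustration of the kind of computation they feed, the sketch below uses the standard Hedges' g for two independent groups compared at posttest (the standardized mean difference with a small-sample correction). This choice is an assumption; pretest-posttest designs call for design-specific formulas (see Morris & DeShon, 2002). The function name and input values are hypothetical.

```python
import math


def hedges_g(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Hedges' g for independent treatment and control groups at posttest."""
    df = n_t + n_c - 2
    # Pooled standard deviation across the two groups.
    pooled_sd = math.sqrt(((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / df)
    d = (mean_t - mean_c) / pooled_sd  # Cohen's d
    j = 1 - 3 / (4 * df - 1)           # small-sample correction factor
    return j * d


# Hypothetical posttest statistics: two groups of 30 students each.
g = hedges_g(mean_t=80.0, sd_t=10.0, n_t=30, mean_c=75.0, sd_c=10.0, n_c=30)
print(round(g, 3))
```

With equal standard deviations of 10 and a 5-point mean difference, Cohen's d is 0.5, and the correction factor shrinks it slightly.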

Outcome variables
To ascertain which cognitive and noncognitive benefits can be derived through CT-STEM transference, we constructed a hierarchical structure and classified these benefits into four categories: CNT, NCNT, CFT, and NCFT (see Table 3). CNT (i.e., domain-specific cognitive skills/knowledge) occurs when skills or knowledge acquired in CT-STEM are applied to a domain that is closely related, such as CT knowledge/concepts and CT practices/skills (Scherer et al., 2019; Sun & Zhou, 2022).
In the included studies, CNT was measured using (a) validated tests, such as the Computational Thinking test (CTt), and (b) self-developed tests/tasks for evaluating students' comprehension of subject-specific concepts and knowledge. NCNT pertains to shifts in students' attitudes, motivations, self-efficacy, or perceptions concerning the related domain (e.g., CT-STEM, iSTEM, STEM, or programming) following their engagement with CT-STEM (Bloom & Krathwohl, 1956). Measures for NCNT in the selected studies primarily utilized standardized scales, with some employing self-developed scales. CFT (i.e., domain-general cognitive skills) manifests when the skills attained from CT-STEM are applied to different domains (Doleck et al., 2017; Xu et al., 2022). These skills, such as reasoning skills, creativity, and critical thinking, were mostly assessed by standardized scales and various tests like the Bebras test, TOPS test, Computational Thinking Scale (CTS) (e.g., Korkmaz et al., 2017; Tsai et al., 2019, 2021), and Cornell Critical Thinking test (CCTT). NCFT involves the transfer of skills from CT-STEM to higher-order noncognitive learning outcomes such as cooperativity and communication (OECD, 2018). Measurement techniques for this category included validated scales along with specific self-developed tasks. Finally, we calculated the measured frequency of each benefit in the selected papers (N = 37) and used bar charts for visualization to answer RQ1.

Moderator variables
Based on the framework presented in Fig. 1 and previous meta-analyses in CT-STEM and related fields (e.g., educational robotics, programming, and CT), we examined two types of moderators for their potential role in enhancing transferability within CT-STEM (see Table 4): (1) Sample features. These comprised the educational levels targeted by the intervention (kindergarten, primary school, secondary school, and university/college) and the sample size, with the latter equating to class size in educational contexts and varying across studies. (2) Study features. The design of the primary studies was coded as either an IGPP, an IGP, or an SGPP. Considering the possibility of multiple designs occurring within one study, we elected to code them independently (Scherer et al., 2020). For subject, the coding of categories was based primarily on the intervention transfer area (Ye et al., 2022). When CT was integrated into several subjects, we coded such studies as "Multiple STEM subjects." Based on Wang et al.'s (2022a) review, we treated instructional strategy as an additional possible moderating variable and coded it as "instructional models," "topic contexts," "scaffolding strategies," and "collaborative strategies." Table 1 provides an account of these instructional strategies and contains sample references; Supplementary Material A2 contains more detailed descriptions of these strategies for each included study. Finally, the length of the intervention was extracted and later coded as < 1 week, 1 week-1 month, 1 month-1 semester, > 1 semester, and not mentioned.

Calculating effect sizes
We computed effect sizes using the Comprehensive Meta-Analysis (CMA) Software 3.0 (Borenstein et al., 2013). To increase the number of articles in our meta-analysis, we included three types of study designs (Morris & DeShon, 2002). Despite potential time bias and selection bias, our study used the same metric (i.e., raw-score metric) for calculating effect sizes. This metric is insensitive to variations in ρ and is recommended when homogeneity of ρ cannot be assumed or tested empirically (Morris & DeShon, 2002). These calculations were based on the means and standard deviations of the student learning outcome data. If these values were not reported in the studies, we used other statistics to calculate the standardized mean difference, such as t-values, z-scores, F-values, Cohen's d, SE, and confidence intervals (95% CI) (Borenstein et al., 2009). All reported p-values are two-tailed unless otherwise reported.
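As a hedged illustration of the conversions mentioned above, the following Python sketch computes a standardized mean difference for an independent-groups posttest comparison, first from means and standard deviations and then from a reported t-value. The function names and the example numbers are hypothetical. The sketch assumes a pooled-standard-deviation denominator; the computations performed by CMA, and the raw-score metric applied to pretest-posttest designs, may differ.

```python
import math


def pooled_sd(sd_t, n_t, sd_c, n_c):
    """Pooled standard deviation of a treatment and a control group."""
    return math.sqrt(((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / (n_t + n_c - 2))


def d_from_means(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Standardized mean difference from group means and standard deviations."""
    return (mean_t - mean_c) / pooled_sd(sd_t, n_t, sd_c, n_c)


def d_from_t(t_value, n_t, n_c):
    """Standardized mean difference recovered from an independent-samples t-value."""
    return t_value * math.sqrt((n_t + n_c) / (n_t * n_c))


# Hypothetical posttest scores: treatment M = 78, control M = 72, both SD = 10, n = 30.
d_direct = d_from_means(78, 10, 30, 72, 10, 30)

# The same comparison expressed as a t-value and converted back to d.
t_value = (78 - 72) / (10 * math.sqrt(1 / 30 + 1 / 30))
d_converted = d_from_t(t_value, 30, 30)
```

Both routes give the same value for the same underlying data, which is why a study that reports only a t-value and group sizes can still contribute an effect size.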

We calculated the effect sizes using Hedges' g, which allows the integration of results from varied research designs with minimal bias and provides a global measure of CT-STEM effectiveness (Sun et al., 2021). Hedges' g was interpreted following Hedges and Olkin (2014), with 0.20-0.49 indicating a low effect, 0.50-0.79 a medium effect, and 0.80 and above a high effect. CMA 3.0 supports the amalgamation of multiple study designs in a single analysis (Borenstein et al., 2013). Leveraging this feature, we used experimental design as a moderator to mitigate potential bias (Morris & DeShon, 2002). The statistically nonsignificant p-value of the Q test (p = 0.343) failed to reject the null hypothesis of no difference between mean effect sizes calculated from alternate designs. Therefore, effect sizes from different designs can be meaningfully combined (Delen & Sen, 2023; Morris & DeShon, 2002). Due to substantial variations in outcome measures and environments across studies, we employed the random-effects model to address RQ2 (a) and RQ3 (a) by calculating overall and subgroup effect sizes (Borenstein et al., 2021; Xu et al., 2019).
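The small-sample correction behind Hedges' g and the interpretation bands stated above can be sketched in Python as follows. This is an illustration under assumptions: the function names are hypothetical, the correction uses the common approximation J = 1 - 3 / (4 df - 1) for two independent groups, and the label for values below 0.20 is our own placeholder because the text does not name that band.

```python
def hedges_g(d, n_t, n_c):
    """Apply the approximate small-sample correction to a standardized mean difference."""
    df = n_t + n_c - 2
    correction = 1 - 3 / (4 * df - 1)  # approximation of the correction factor J
    return d * correction


def interpret_g(g):
    """Map |g| onto the bands used in this review (0.20, 0.50, and 0.80 cut-offs)."""
    size = abs(g)
    if size >= 0.80:
        return "high"
    if size >= 0.50:
        return "medium"
    if size >= 0.20:
        return "low"
    return "below the low band"  # placeholder label; not defined in the text


# Hypothetical study: d = 0.6 with 30 students per group.
g_example = hedges_g(0.6, 30, 30)
label_example = interpret_g(g_example)
```

The correction shrinks d slightly, and the shrinkage becomes negligible as group sizes grow, so g and d differ mainly for small classes.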

Non-independence
We calculated one effect size per study to ensure the independence of the effect sizes; however, if a study reported multiple benefits that did not overlap, the effect size for each benefit was included in the analysis. Additionally, when a study reported effect sizes for separate groups of students (e.g., students in grades 1, 2, and 3) where the participants did not overlap, the effect sizes for each group were considered independent samples (Lipsey & Wilson, 2001). When a study reported multiple assessments (e.g., midterm and final exams) in one subject area, we selected the most comprehensive assessment (Bai et al., 2020).

Analyses of heterogeneity
Heterogeneity (i.e., the degree of inconsistency in the studies' results) was assessed using the I² statistic, which was calculated to show the ratio of between-groups variance to the total variation across effect sizes, revealing the variation in effect sizes stemming from differences among studies (Shamseer et al., 2015). Then, we conducted a moderator analysis to pinpoint potential sources of variance in transfer effect sizes, including examining the overall, near, and far transfer effects, to address RQ2 (b) and RQ3 (b).
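For readers who want to see how this statistic relates to Cochran's Q, a minimal Python sketch follows. It assumes the standard definition I² = (Q - df) / Q with df = k - 1, truncated at zero; the function name is hypothetical. The Q values are those reported in the Results, and the subgroup counts k = 72 and k = 24 are inferred from the reported 5k + 10 thresholds of 370 and 130.

```python
def i_squared(q, k):
    """I-squared (in percent) from Cochran's Q and the number of effect sizes k."""
    df = k - 1
    if q <= 0:
        return 0.0
    # Share of total variation attributable to between-study differences.
    return max(0.0, (q - df) / q) * 100


overall = i_squared(853.052, 96)   # reported as 88.864
near = i_squared(671.379, 72)      # reported as 89.425
far = i_squared(93.552, 24)        # reported as 75.415
```

Under this definition the three reported I² values are reproduced to within rounding, which supports reading them as percentages.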

Publication bias
We conducted three additional analyses to determine if publication bias affected the review results: a funnel plot, Egger's test, and the classic fail-safe N. The funnel plot is a graphical tool that plots effect sizes against standard errors to check whether publication bias distorted treatment effects (Egger et al., 1997). We used Egger's test to examine symmetry and quantify the amount of bias captured by the funnel plot (Bai et al., 2020; Borenstein, 2005). The classic fail-safe N was calculated to address the issue of publication bias affecting the effect size. Specifically, when the meta-analysis results are significant, it is essential to calculate the number of missing and unpublished studies that would need to be included to make the combined effect nonsignificant (Rosenthal, 1979). According to Rosenberg (2005), the fail-safe N (X) should reach 5k + 10 to ensure that X is large relative to k (the number of independent effect sizes). The greater the fail-safe N value, the smaller the publication bias.
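The fail-safe N logic can be sketched in a few lines of Python. This is an illustration under assumptions: the function names are hypothetical, the fail-safe N follows Rosenthal's (1979) formula with a one-tailed critical value of 1.645 (CMA's settings may differ), and the Z-scores are made up for demonstration.

```python
def rosenthal_fail_safe_n(z_scores, z_alpha=1.645):
    """Number of null-result studies needed to make the combined Z nonsignificant."""
    k = len(z_scores)
    return (sum(z_scores) ** 2) / (z_alpha ** 2) - k


def tolerance_threshold(k):
    """The 5k + 10 benchmark against which the fail-safe N is compared."""
    return 5 * k + 10


def is_robust(fail_safe_n, k):
    """True when the fail-safe N exceeds the 5k + 10 benchmark."""
    return fail_safe_n > tolerance_threshold(k)


# Hypothetical set of ten effect sizes, each with Z = 2.0 (illustration only).
example_n = rosenthal_fail_safe_n([2.0] * 10)

# Benchmark for the 96 effect sizes analyzed in this review: 5 * 96 + 10 = 490.
overall_benchmark = tolerance_threshold(96)
```

The benchmark grows linearly with k, so a fail-safe N that looks large in absolute terms still has to be judged against the number of effect sizes pooled.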

Cognitive and noncognitive benefits through CT-STEM's transfer effect (RQ1)
Our investigation of CT-STEM transference revealed 36 benefits, detailed in Fig. 3. This includes benefits from both near and far transfer: seventeen cognitive and eight noncognitive benefits were attributed to near transfer (CNT and NCNT, respectively), while nine cognitive and two noncognitive benefits resulted from far transfer (CFT and NCFT, respectively).
The top five benefits most frequently documented in empirical CT-STEM research were mathematics achievement (f = 9), CT knowledge/concepts (f = 7), CT (f = 5), physics achievement (f = 5), and self-efficacy (f = 5). The notable medium frequency of certain NCNT, such as self-efficacy and motivation, highlights a dual focus in research: enhancing both cognitive skills and noncognitive gains in students involved in CT-STEM. There has been greater integration of CT into mathematics and science; however, other disciplines (e.g., biology, chemistry, social science, and culture) have received less attention. The limited observation of NCFT (only two identified) underscores the potential for broader research explorations.

CT-STEM's overall transfer effects and moderator analysis (RQ2)

Overall transfer effects of CT-STEM (RQ2a)
In total, 37 primary studies involving 7832 students were included in the sample, yielding 96 effect sizes. Among these studies, 62% (23 studies) utilized an IGPP design, 35% (13 studies) adopted an SGPP design, and 3% (1 study) employed an IGP design. In this meta-analysis, we first analyzed the 37 empirical studies using a random-effects model. Our findings show a significant overall effect size favoring the transfer effect of CT-STEM on both cognitive and noncognitive benefits for students (g = 0.601, 95% CI [0.510, 0.691], Z = 12.976, p < 0.001) (see Fig. 4). The heterogeneity test results showed a significant Q value (Q = 853.052, I² = 88.864%, p < 0.001), suggesting substantial heterogeneity in the study effect sizes. Thus, a moderator analysis of different contextual variables was required in subsequent analyses.
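To make the pooling step concrete, here is a minimal Python sketch of a random-effects estimate. It assumes the DerSimonian-Laird estimator of between-study variance, which the text does not confirm as the estimator applied by CMA, and it uses three hypothetical effect sizes rather than data from the included studies. All names are illustrative.

```python
import math


def random_effects_pool(effects, variances):
    """Pool effect sizes with DerSimonian-Laird random-effects weights."""
    k = len(effects)
    weights = [1 / v for v in variances]
    sum_w = sum(weights)
    # Fixed-effect mean and Cochran's Q measure dispersion around it.
    fixed_mean = sum(w * g for w, g in zip(weights, effects)) / sum_w
    q = sum(w * (g - fixed_mean) ** 2 for w, g in zip(weights, effects))
    # Between-study variance (tau squared), truncated at zero.
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random-effects weights add tau squared to each study's own variance.
    re_weights = [1 / (v + tau2) for v in variances]
    pooled = sum(w * g for w, g in zip(re_weights, effects)) / sum(re_weights)
    se = math.sqrt(1 / sum(re_weights))
    return {
        "pooled": pooled,
        "se": se,
        "z": pooled / se,
        "ci_low": pooled - 1.96 * se,
        "ci_high": pooled + 1.96 * se,
        "q": q,
        "tau2": tau2,
    }


# Hypothetical Hedges' g values and sampling variances (illustration only).
result = random_effects_pool([0.2, 0.6, 1.0], [0.01, 0.01, 0.01])
```

When studies disagree more than sampling error alone would predict, tau squared is positive, the weights become more equal, and the confidence interval widens relative to a fixed-effect analysis.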
To assess potential publication bias in our meta-analysis, we generated a funnel plot and performed the classic fail-safe N and Egger tests. As depicted in Fig. 5, the studies were mostly evenly distributed on both sides of the funnel plot and located in the middle to upper effective areas (Egger et al., 1997). The classic fail-safe N value was 4702, far exceeding the conservative threshold of 5k + 10 (490). Moreover, Egger's intercept was 1.01 [−0.03, 2.05] with a p-value of 0.06, which indicates no statistically significant evidence of publication bias in our data set.
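The intercept reported above comes from a regression-based asymmetry test. A minimal Python sketch follows, assuming the classic form of Egger's regression in which each standardized effect (g / SE) is regressed on its precision (1 / SE) and the intercept is tested with k - 2 degrees of freedom. The function name and input values are hypothetical, and the implementation in CMA may differ in detail.

```python
import math


def egger_regression(effects, std_errors):
    """Return intercept, its standard error, t statistic, and df for Egger's test."""
    k = len(effects)
    if k < 3:
        raise ValueError("at least three effect sizes are required")
    x = [1 / se for se in std_errors]                   # precision
    y = [g / se for g, se in zip(effects, std_errors)]  # standardized effect
    mean_x, mean_y = sum(x) / k, sum(y) / k
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("standard errors must vary across studies")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual variance and the ordinary least squares standard error of the intercept.
    df = k - 2
    residual_ss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se_intercept = math.sqrt((residual_ss / df) * (1 / k + mean_x ** 2 / sxx))
    t_stat = intercept / se_intercept if se_intercept > 0 else float("nan")
    return {"intercept": intercept, "se": se_intercept, "t": t_stat, "df": df}


# Hypothetical data in which less precise studies report larger effects.
ses = [0.1, 0.2, 0.3, 0.4]
biased = egger_regression([0.5 + se for se in ses], ses)

# Hypothetical data with the same effect regardless of precision.
unbiased = egger_regression([0.5, 0.5, 0.5, 0.5], ses)
```

An intercept near zero indicates a symmetric funnel, while a clearly positive intercept means that smaller, less precise studies tend to report larger effects.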

Moderator analysis of overall transfer effects (RQ2b)
We examined six variables as potential moderators (educational level, sample size, study design, subject, instructional strategy, and intervention duration) using the random-effects model to identify the sources of heterogeneity (see Table 5). The moderator analysis indicated no significant differences in effect size among various study designs (QB = 2.142, df = 2, p = 0.343). This suggests that different designs estimate a similar treatment effect, allowing for a combined analysis of effect sizes across designs (Morris & DeShon, 2002). Further, the analysis showed that the subject did not significantly moderate the CT-STEM benefits (QB = 13.374, df = 9, p = 0.146), indicating effective CT integration across various STEM disciplines (g = 0.567, p < 0.001). However, we observed a notable exception in social science (g = 0.727, p = 0.185), where the integration effect was not significant, in contrast to significant effects in subjects like engineering (g = 0.883, p < 0.001) and science (g = 0.875, p < 0.001).
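The p-values attached to the between-group statistic QB can be checked with a short Python sketch, assuming QB is referred to a chi-square distribution with the stated degrees of freedom. The function name is hypothetical; it uses the closed-form upper-tail expressions for integer degrees of freedom so that only the standard library is needed.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with integer df."""
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2
    if df % 2 == 0:
        # Even df: exp(-x/2) times a finite series in (x/2).
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: normal tail term plus a finite series in odd powers of sqrt(x).
    root = math.sqrt(x)
    term, total = root, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * r + 1)
    return math.erfc(root / math.sqrt(2)) + math.sqrt(2 / math.pi) * math.exp(-half) * total


p_design = chi2_sf(2.142, 2)    # study design, reported p = 0.343
p_subject = chi2_sf(13.374, 9)  # subject, reported p = 0.146
```

Both reported p-values are reproduced to three decimals under this assumption, which is consistent with QB being evaluated against a chi-square distribution with df equal to the number of subgroups minus one.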

Fig. 4 Forest plot of effect size (Hedges' g) in the random-effect model

the highest effect size (g = 0.826, p < 0.001), while larger groups (over 150 students) showed the lowest (g = 0.233, p < 0.001), suggesting a decrease in effect with increasing class size. Instructional strategy was a significant moderator, indicating that the intervention strategy type significantly impacts CT-STEM's transfer effects. Strategies involving topic contexts (e.g., modeling, simulation, robotics, programming) had the largest effect (g = 0.647, p < 0.001), followed by scaffolding methods (e.g., (meta) cognitive scaffolding) (g = 0.492, p < 0.001), with the instructional model strategy showing the smallest effect (g = 0.394, p < 0.001). In addition, intervention duration was a critical moderator. The largest effect was observed in interventions lasting between one week and one month (g = 0.736, p < 0.001), with longer durations showing diminishing effects.

CT-STEM's near and far transfer effects and moderator analysis (RQ3)

Near transfer effect by cognitive and noncognitive benefits (RQ3a)
To further analyze the effect size of CT-STEM near-transfer, we focused on a subgroup encompassing both cognitive and noncognitive benefits, as detailed in Table 6. We observed that the effect size for CT-STEM near-transfer is 0.645 (95% CI [0.536, 0.753], Z = 11.609, p < 0.001), indicating a moderate impact on near-transfer benefits, with cognitive benefits demonstrating a larger effect size (g = 0.672, 95% CI [0.540, 0.804], Z = 9.978, p < 0.001) compared to noncognitive benefits (g = 0.547, 95% CI [0.388, 0.706], Z = 6.735, p < 0.001). This suggests that CT-STEM interventions are more impactful on cognitive aspects, e.g., CT skills, programming abilities, and algorithmic thinking, than noncognitive aspects, such as self-efficacy, learning motivation, and attitudes. We utilized a funnel plot to assess and illustrate the publication bias of the study (see Fig. 6). The majority of the studies cluster in the effective area of the plot. The symmetric distribution of studies on the funnel plot's left and right sides suggests minimal publication bias. Furthermore, Egger's test yielded a result of t (70) = 0.85 with a p-value of 0.40, reinforcing this indication. The classic fail-safe N was calculated to be 6539, substantially exceeding the estimated number of unpublished studies (5k + 10 = 370). Therefore, these results collectively suggest that publication bias has a negligible impact on CT-STEM's near-transfer effects.

Far transfer effect by cognitive and noncognitive benefits (RQ3a)
In examining CT-STEM far-transfer as a specific subgroup (see Table 6), we found a moderate effect size (g = 0.444, 95% CI [0.312, 0.576], Z = 6.596, p < 0.001), indicating a significant positive impact of CT-STEM on students' generic skills, including creativity, critical thinking, and problem-solving. A comparison of effect sizes between cognitive and noncognitive benefits revealed that cognitive benefits (g = 0.466, 95%

The funnel plot for far-transfer effects (see Fig. 7) shows some degree of asymmetry, which was further substantiated by Egger's test, yielding t (24) = 3.90 with a p-value of less than 0.001. Although the calculated fail-safe N (N = 794) is considerably larger than the threshold of 5k + 10 (130), the asymmetry does suggest the possibility of some publication bias in the far-transfer effects of our study.

Heterogeneity and moderator analysis of near and far transfer effects (RQ3b)
We conducted heterogeneity assessments for each subgroup, focusing on near-transfer and far-transfer effects. The significant Q statistic values indicated high heterogeneity in both groups (Q_near = 671.379, I² = 89.425%, p < 0.001; Q_far = 93.552, I² = 75.415%, p < 0.001). We then explored moderating effects based on educational level, sample size, subject, instructional strategy, and intervention duration. The results showed that the near-transfer effect of CT-STEM is moderated by educational level, sample size, instructional strategy, and intervention duration (see Table 7). In contrast, the far-transfer effect is moderated only by educational level and sample size (see Table 8). These findings suggest that the near-transfer effect is more susceptible to variations in contextual factors than the far-transfer effect.

Discussion and implications
This study examined the transfer effects of CT-STEM on students' cognitive and noncognitive skills through a systematic literature review and meta-analysis. The main findings and implications of this study are discussed in the following sections.

Cognitive and noncognitive benefits through CT-STEM transfer effects
RQ1 asks what cognitive and noncognitive benefits are derived from the transfer effects of CT-STEM.
From 37 empirical studies, we identified 36 benefits, categorized into four types: CNT, CFT, NCNT, and NCFT. These benefits are consistent with findings in prior studies (e.g., Melro et al., 2023; Román-González et al., 2018; Scherer et al., 2019; Tsarava et al., 2022; Ye et al., 2022), indicating that CT-STEM not only provides cognitive and noncognitive benefits but also fosters the development of domain-specific and domain-general skills. Most prior research has focused on CT-STEM's impact on students' mathematics achievement, CT skills/concepts, self-efficacy, and cooperativity. Our results further suggest that CT-STEM enhances cognitive skills while significantly contributing to affective and social learning outcomes. This finding supports the view that while CT is primarily cognitive, akin to problem-solving abilities, it has a significant noncognitive aspect (Román-González et al., 2018). An illustrative example is the study by Wang et al. (2022b), which developed a non-programming, unplugged-in CT program in mathematics that effectively improved students' CT skills, cooperation tendencies, and perceptions of CT.
Most transfer studies to date have focused on students' mathematics and science achievement, with less emphasis on other subjects like physics, biology, and chemistry. One reason is the overlap in thinking practices among these disciplines and CT (Rich et al., 2019; Ye et al., 2023). For example, modeling and simulating complex phenomena in these subjects foster problem decomposition skills, crucial in mathematics, science, and CS. Additionally, CT offers an analytical and systematic framework for problem-solving, a key aspect in tackling complex mathematical and scientific problems (Berland & Wilensky, 2015). Despite this, CT's potential in a wider range of subjects remains underexplored (Ye et al., 2022). Previous studies have identified potential challenges in integrating CT into diverse STEM disciplines (Kite & Park, 2023; Li et al., 2020a), and finding suitable curriculum topics that effectively utilize CT's benefits can be difficult. Beyond mathematics, CT-STEM transfer studies have looked at topics like ecology (Christensen & Lombardi, 2023; Rachmatullah & Wiebe, 2022), force and motion (Aksit & Wiebe, 2020; Hutchins et al., 2020a, 2020b), and chemical reactions (Chongo et al., 2021). This situation indicates a need for exploring a broader range of STEM topics to fully leverage the synergy between CT and STEM.
Our review identified only two far-noncognitive benefits of CT-STEM, suggesting these benefits may be harder to measure. Gutman and Schoon (2013) noted that far-noncognitive skills like perseverance and persistence

CT-STEM's transfer effects
For RQ2 (a) and RQ3 (a), our meta-analysis indicates positive impacts on both cognitive (g = 0.628) and noncognitive benefits (g = 0.510), each showing moderate effect sizes. This finding supports the use of CT-STEM in enhancing students' cognitive and noncognitive skills, as suggested by Lee et al. (2020), who argue that integrating CT in STEM encourages deeper engagement in authentic STEM practices, thereby developing a broad spectrum of skills, including cognitive and noncognitive aspects.
Our finding that cognitive benefits exhibit greater effect sizes than noncognitive benefits across both near-transfer and far-transfer contrasts with previous research by Kautz et al. (2014), which suggested noncognitive skills are more malleable. Two factors that might explain this disparity are gender and age. Gender may be a significant factor since CT-STEM requires students to utilize computational concepts, practices, and perspectives to solve complex, real-world problems, which can have inherent gender biases. For example, Czocher et al. (2019) found that female students often experience more frustration and lower engagement in CT-STEM, and similar studies report that they have lower interest, confidence, and self-efficacy than males (Wang et al., 2022b). Jiang and Wong (2022) found no significant gender differences in cognitive skills like CT, indicating that the differences might lie in the affective skill domains and suggesting that students' noncognitive skills might be less malleable than their cognitive skills in CT-STEM programs. As such, increasing students' motivation, especially among girls, is a crucial issue for future studies (Tikva & Tambouris, 2021b). Student age may also be a contributing factor. Lechner et al. (2021) demonstrated that age influences skill adaptability, with younger individuals showing greater exploratory behavior and neural plasticity. Both characteristics are pivotal for cognitive development (e.g., reasoning skills and literacy) (Gualtieri & Finn, 2022), making cognitive skills more plastic than noncognitive skills. This aligns with our findings, where a significant proportion of studies (49%) focused on primary school settings, reinforcing the importance of early CT integration.
In comparing the near- and far-transfer effects, our analysis shows that the effect size for near-transfer is higher than that for far-transfer in both cognitive and noncognitive domains, aligning with previous findings that identified a strong effect of programming through near transfer (g = 0.75, 95% CI [0.39, 1.11]) and a moderate effect through far transfer (g = 0.47, 95% CI [0.35, 0.59]) (Scherer et al., 2019). One explanation is offered by the theory of "common elements" (Singley & Anderson, 1989), which suggests that skills developed through CT-STEM are more readily transferable to similar contexts due to shared conceptual commonalities and elements (Nouri et al., 2020; Scherer et al., 2019). Essentially, students proficient in a skill often find it easier to apply this proficiency to a related skill that shares foundational principles and strategies (Baldwin & Ford, 1988). Despite this, the far-transfer effects in CT-STEM do occur and are significant. We stress the importance of developing effective strategies that foster these far-transfer effects within the CT-STEM curriculum. One approach is identifying "common elements" and conceptual similarities between different disciplinary contexts and skills, thus promoting transference.

Contextual variables explaining variation in the CT-STEM's transfer effects
In our meta-analysis (RQ2 (b) and RQ3 (b)), we examined the heterogeneity of CT-STEM's overall, near-transfer, and far-transfer effects using moderators: educational level, sample size, study design, subject, instructional strategy, and intervention duration. For the overall transfer effects, we found significant variations in the effect size, with notably higher efficacy observed in grade school students than university students. This finding further advocates for the early integration of CT in STEM education (Nouri et al., 2020). This difference in CT-STEM's impact can be attributed to two factors: (1) it correlates with students' cognitive and noncognitive development, with early grades being crucial for acquiring these benefits (Jiang & Wong, 2022); (2) the hands-on, experiential nature of CT-STEM, utilizing tangible materials and interactive simulations, is particularly suited to the development and learning needs of young children (Thomas & Larwin, 2023). Also, class size emerged as a strong moderator (Li et al., 2022; Sun & Zhou, 2022; Sun et al., 2021), with smaller classes (under 50 students) showing more pronounced transfer effects. As class size increases, the impact of CT-STEM on skills development decreases, possibly due to logistical constraints, e.g., space, equipment, and resources (Cheng et al., 2023). We also found significant differences due to instructional strategies. Learning activities involving computational modeling, simulation, and embodied learning yielded larger effect sizes. This supports constructivist educational methods like computational modeling for simulating complex phenomena and facilitating content learning (Basu et al., 2015; Sengupta et al., 2013). For intervention duration, we found that CT-STEM interventions of one week to one month are most effective in enhancing students' learning outcomes, after which the effect size diminishes, in agreement with Sun et al. (2021). This time window may reflect the need to balance learning time against students' ongoing interest and motivation, with extended durations leading to a decrease in motivation and interest as students adjust to the new learning method (Appleton et al., 2008; Cheng et al., 2023). Importantly, our analysis revealed that subject matter had little impact on CT-STEM benefits, suggesting broad applicability across various STEM subjects.
Our analysis of near- and far-transfer effects in CT-STEM shows that educational level, sample size, instructional strategy, and intervention duration significantly moderate near-transfer effects, while far-transfer effects are mainly moderated by educational level and sample size. One explanation is that near-transfer effects are linked to domain-specific skills, responding to particular instructional elements like strategies and duration (van Graaf et al., 2019). In contrast, far-transfer effects for domain-general skills like critical thinking show significant moderation primarily by educational level and sample size, rather than instructional design. This may be due to a predominant focus on domain-specific skills in current instructional designs (Geary et al., 2017). One attractive alternative is to consider CT as a transdisciplinary thinking practice and integrate it across various STEM subjects to enhance students' domain-general skills development (Li et al., 2020b).
The far-transfer effects are linked to cognitive development and social contexts, and thus influenced by educational level, which aligns with cognitive maturation and skill readiness (Jiang & Wong, 2021; Zhan et al., 2022). In addition, sample size affects social skills and classroom dynamics (Sung et al., 2017; Yılmaz & Yılmaz, 2023). Therefore, in designing CT-STEM activities, it is crucial to consider age-appropriate objectives and learning content, as well as class size, for optimal development of cognitive and social skills. Future research should continue to explore these factors, particularly in developing social skills.

Theoretical and practical implications
This study provides new knowledge for CT-STEM research and informs CT-STEM instructional design and practice. This work extends the current understanding of CT-STEM's transfer effects on students' cognitive and noncognitive domains. Our findings support the premise that CT-STEM can significantly enhance the development of students' cognitive and noncognitive skills through near and far transfer. In addition, we provide a simple hierarchical structure that integrates cognitive and noncognitive domains through a transfer perspective (see Table 3). This structure can guide researchers in systematically classifying and identifying measurable constructs, leading to a more comprehensive understanding of student learning in CT-STEM.
Analysis of moderators provides actionable guidance for CT-STEM instructional design to capitalize on positive transfer effects. For overall and near-transfer effects, we encourage early integration of CT into individual and iSTEM disciplines through purposefully designed activities. We show that smaller class sizes (under 50 students), interventions lasting one week to one month, and strategic selection of instructional methods like computational modeling promote more effective transference (see Tables 5 and 7). Consequently, we recommend that educators and instructional designers prioritize creating collaborative learning environments using in-person, hybrid, and online collaborative platforms, reducing logistical issues and allowing for closer monitoring of group interactions and timely feedback. Flexible curriculum design, with durations ranging from intensive one-week modules to longer month-long projects, is key to maximizing transference learning effects. Given computational modeling's central role in STEM (NGSS Lead States, 2013), we encourage educators looking to integrate CT into classroom teaching to consider it as a primary entry point. To support far-transfer, educators need to develop age-appropriate content and activities that align with students' cognitive development progression (Zhang & Nouri, 2019), alongside fostering a collaborative culture that nurtures social skills. For instructional models that have shown the greatest effect sizes (see Table 8), we strongly encourage teachers, especially those with prior experience in CT integration, to develop instructional models based on engineering design processes (Wiebe et al., 2020) that engage students in problem-solving and the creation of creative artifacts to foster their higher-order thinking skills.

Conclusion
This systematic literature review and meta-analysis examined the cognitive and noncognitive benefits of CT-STEM's transfer effects. Analyzing 96 effect sizes from 37 qualifying studies, we found: (a) 36 distinct CT-STEM benefits across four categories, namely, CNT, CFT, NCNT, and NCFT; (b) CT-STEM had overall medium and significant impacts on four categories of benefits (g = 0.601); (c) the effect size of near-transfer (g = 0.645) was greater than that of far-transfer (g = 0.444), and cognitive benefits (g = 0.628) consistently showed a larger effect size than noncognitive benefits (g = 0.510); (d) educational level, sample size, instructional strategy, and intervention duration significantly moderated both overall and near-transfer effects, while far-transfer effects were significantly moderated only by educational level and sample size. Our findings provide a roadmap for curriculum designers and teachers to more effectively and efficiently integrate CT into STEM education at all grade levels, enhancing student development of both cognitive and noncognitive skills.
This study has several limitations. Although it uses a comprehensive review of the literature across seven databases, some specialized sources might have been overlooked. This highlights the need for future research to include more specialized/professional databases for an additional understanding of CT-STEM's transfer effects. While the standardization of effect sizes and moderator analysis helped to mitigate potential biases from diverse study designs, further methodological enhancements are warranted in future studies. The findings on noncognitive benefits through far transfer (NCFT), such as social competencies, are limited by the nature of the research dataset and the limited research available (Lai & Wong, 2022; Lai et al., 2023). This indicates a need for the rigorous development of measurement tools and instructional designs in this area. Finally, we investigated six moderators within CT-STEM but did not examine aspects like curriculum characteristics and teachers' experience. These areas, due to their qualitative nature and infrequent reporting in our sample studies, were not included but are significant avenues for future research. Despite these limitations, the study's contributions are significant, as it systematically elucidates the cognitive and noncognitive benefits from CT-STEM transfer effects and provides robust evidence. The identified moderators aid educators in facilitating the occurrence of transfer within classroom teaching.

Li and Oon International Journal of STEM Education (2024) 11:44

Fig. 1 Conceptual framework of the present meta-analysis

CNT = Cognitive benefits through near transfer effect; NCNT = Noncognitive benefits through near transfer effect; CFT = Cognitive benefits through far transfer effect; NCFT = Noncognitive benefits through far transfer effect

Table 1
Instructional strategies for CT-

Table 2
Inclusion and exclusion criteria

Table 3
A classification structure of measured constructs and their corresponding measurements in the included studies

Table 4
Coding information of moderator variables

Table 5
Moderator analysis of the overall effect

Table 6
The summary of effect size for different subgroups

Table 7
Moderator analyses of the near-transfer effect

Table 8
Moderator analyses of the far-transfer effect