Using meta-analysis of technique and timing to optimize corrective feedback for specific grammatical features

Grammar varies in semantic, morphosyntactic, and phonological complexity, which may influence what type of Corrective Feedback (CF) is effective. The present study was designed to investigate the best form of CF for each grammatical feature, in conjunction with associated variables such as learner proficiency level and L1 influence. Fifteen studies, all of which had English as a Foreign Language (EFL) learners with a Korean L1 and productive measures of speech or writing, were selected for study. Results suggest that grammatical features which are very similar to the L1 may benefit more from implicit reformulations (recasts). Explicit CF (direct and metalinguistic feedback) appears more useful for semantically and syntactically complex features such as the English article and past hypothetical conditional. Finally, analysis of proficiency level suggests that timely emphasis of specific grammatical features is needed in order for CF to be effective.

working memory all have a significant impact on the efficacy of CF (Li, 2017; Lyster & Saito, 2010; Rassaei, 2015; Sheen, 2007b; Sheen, 2008). In addition to learner variables, environmental variables and their impact on CF have been heavily explored. In a meta-analysis of 15 classroom-based studies by Lyster and Saito (2010), for example, aspects of instruction like foreign language setting and treatment length were investigated. While the study suggests that foreign language setting is not a significant factor, duration was indeed significant, with longer treatments having larger effects.
Although examining the influence of both learner and treatment variables is an essential part of understanding CF, one major limitation in past research designs has prevented accurate assessment of efficacy or inefficacy. In earlier experimental studies, only one grammatical feature was traditionally used to assess efficacy or inefficacy of a specific technique. In a study by Sheen (2007a), for example, the definite article was used to assess direct-only and metalinguistic correction, with results suggesting that direct metalinguistic feedback was superior. In another study by Yang and Lyster (2010), prompts were found to be superior to recasts when teaching regular and irregular past tenses. While insightful, findings may only be generalized to the grammatical feature being targeted. We cannot say that metalinguistic feedback will be equally effective with the regular past, nor can we say that prompts will be more effective when used with the definite article. All grammatical features vary in semantic, morphosyntactic, and phonological complexity.
Albeit limited, research suggests that type of grammatical feature does indeed have an impact on the efficacy of CF. In a study with German as a second language learners, a more simplistic feature, the comparative, was examined along with a more complex feature, the dative; recasts were found to be more effective with the more simplistic feature. Prompts, in contrast, were found to be more effective with both features (Van De Guchte, Braaksma, Rijlaarsdam, & Bimmel, 2015). This study reveals that type of grammar does indeed determine which form of CF is most effective. While insightful, the study is limited in scope. It fails to comprehensively analyze enough grammatical features to be of any utility for educators. If the influence of diverse grammatical features is understood, educators can learn how and when to introduce CF that optimizes the acquisition process.

Literature review
Although characteristics of a grammatical feature may significantly impact the effectiveness of CF, this issue has not been comprehensively investigated (Schenck, 2017). Studies that do examine the complexity of grammatical features (Spada & Tomita, 2010; Van De Guchte et al., 2015) often rely on an overly simplistic dichotomy of simple and complex, which masks diverse phonological, morphosyntactic and semantic influences that vary along a continuum. More precise evaluation of grammatical features is needed for educators, who may need to choose pedagogical techniques based upon a specific target feature.
In another study of oral CF, characterization of grammatical features as late and early is equally problematic (Varnosfadrani & Basturkmen, 2009). So-called "early" structures such as the definite article, irregular past, and plural -s differ in several distinct ways, which may affect exactly how and when they are acquired (Goldschneider & DeKeyser, 2005). The definite article is morphologically regular and relatively salient (includes a sonorant vowel) in phonological input, yet it is semantically complex; this feature is imbued with a variety of meanings that include general cultural use (e.g., the sun), immediate situational use (e.g., Don't go in there. The dog will bite you!), perceptual situational use (e.g., Pass me the salt.), and local use (e.g., the car/the pub) (Celce-Murcia, Larsen-Freeman, & Williams, 1999). Both the past irregular and plural -s are semantically simplistic (having only one meaning, either "past" or "more than one"), yet they differ in salience. Whereas the past irregular verb has sonorant vowels and consonants (e.g., went) that make it easy to hear, the plural -s feature often lacks a vowel sound (e.g., cats) and is difficult to perceive in both aural and written input.
As with the "early" classification, categorization of grammatical features as "late" appears to be overly simplistic. Structures such as the indefinite articles a and an, regular past, relative clause, passive voice, and third person -s, which were all classified as late (Varnosfadrani & Basturkmen, 2009), differ considerably in phonological, morphosyntactic, and semantic complexity. Concerning this classification, Li and Vuono (2019) wrote that "the third person -s is a very simple structure, and yet it is late-acquired" (p. 98). This statement reflects a fundamental problem with overly simplistic classification of grammatical features, which cannot explain why acquisition occurs at a particular time. While the third person -s may be semantically simplistic (denoting a meaning of "singular"), it requires an inter-phrasal understanding between the subject and predicate verb, which is known to be more difficult for learners to acquire (Pienemann, 2005). The feature is also phonologically difficult to perceive within input, which may slow acquisition (Goldschneider & DeKeyser, 2005). In addition to the third person -s, other late features differ significantly in phonological, morphosyntactic, and semantic characteristics, making a binary classification ineffective as a means for educators to choose the best form of CF.
Schenck Asian-Pacific Journal of Second and Foreign Language Education (2020) 5:16 Page 3 of 20
Although classification as late or early does not accurately describe differences in grammar, it does appropriately introduce the notion that proficiency level influences CF for a specific target feature. Timing of corrective feedback may be a crucial determinant of efficacy. This view is supported in research, which suggests that a grammatical feature just above a learner's level of proficiency may be "teachable" (Dyson, 2018; Dyson & Håkansson, 2017; Pienemann, 1989). At initial stages of development, learners begin a second language by using basic lexical features such as nouns and verbs for agents and actions, respectively (Pienemann, 2005; VanPatten, 2004). Learners then expand morphology to develop nouns (e.g., the plural -s and indefinite article) or verbs (e.g., the progressive -ing and past -ed) into more sophisticated phrases. Considering this process of development, beginning learners may benefit from CF that emphasizes single words and adjacent morphological features. At an intermediate level of proficiency, learners begin to use grammatical features that reveal a link between two different words or phrases (Pienemann, 2005). During this stage, the third person singular -s, which links a noun (the agent) with a verb (the action), begins to emerge, as does the possessive 's, which links two nouns (e.g., Mary's book). Learners also begin to manipulate word order, further revealing an awareness of semantic and morphosyntactic connections between different words or phrases. In question inversion (What did you do?) and phrasal verbs (He ate the food up), both nouns and verbs are interconnected. Because learners at the intermediate level reveal a growing awareness of relationships on an inter-phrasal level, CF that emphasizes relationships between multiple words and phrases may be more effective at this stage (Pienemann, 1999, 2005).
Finally, advanced learners, whose production reveals the emergence of grammatical features that connect multiple clauses (Pienemann, 1999, 2005), may benefit from CF that targets larger clauses and sentences (e.g., relative clauses and conditionals).
If there is indeed a correct time to introduce a form of corrective feedback, learners will need enough cognitive capacity to handle the input, explicit information, and complexity associated with a specific grammatical feature. As CF becomes more explicit, cognitive load increases, just as load increases when grammatical features become more complex (Rahimpour & Salimi, 2010; VanPatten & Rothman, 2015). Because cognitive capacity and language proficiency may be key determinants of the success of a pedagogical technique, the grammatical feature being emphasized should be carefully considered. Explicit corrective feedback may fail if the learner does not have the cognitive resources to support concentration on the feature, which may explain inconsistent findings within past research. While timely introduction of a feature is essential (Gholami & Zeinolabedini, 2018), very few studies have examined the impact of CF types at different proficiency levels. Furthermore, the effect of introducing grammatical features of different complexity at variable levels of proficiency is not well known. Without a holistic understanding of timing and grammatical features, educators lack the ability to effectively utilize corrective feedback.
A final influence on the efficacy of CF is the degree to which a target feature resembles the learner's L1 (Yang, Cooc, & Sheng, 2017). Research suggests that English features which are either highly similar to or absent from an L1 can be learned implicitly through student involvement with English input; however, if an L1 morphosyntactic feature is similar, yet differs from the L2 in some notable way, this disparity may not be identified without explicit intervention (Williams, 1995). Recent research further confirms that the L1 may influence effectiveness of explicit grammar instruction. In a study by McManus and Marsden (2019), for example, explicit emphasis of L1 processing routines increased the speed, accuracy, and stability of L2 performance. As this research suggests, L1 similarities may aid in the acquisition of L2 features, while L1 differences may hinder acquisition and require more explicit correction. Because the L1 may affect whether implicit or explicit instruction is effective, similarity of an L2 feature to the L1 must be considered when studying the impact of CF.
Through reviewing aspects of grammatical complexity, as well as associated influences of learner proficiency and L1 transfer, it is clear that simplistic methodological classifications of grammatical features are inadequate. Some more recent studies have begun to explore the complexities of CF with grammatical features like syntactic downgraders, which are imbued with a great deal of both syntactic and semantic complexity (Nguyen, Do, Pham, & Nguyen, 2018). While an important step forward, methodological issues continue to limit application of results to practice. There remains an inadequate understanding of the impact of grammatical feature type, L1 influence, and proficiency level on the efficacy of CF.
Regarding prior research used to evaluate various forms of CF such as recasts, Goo and Mackey (2013) write the following: "We conclude by suggesting that making a case against recasts is neither convincing nor useful for advancing the field and that more triangulated approaches to research on all types of corrective feedback, employing varied and rigorous methodological designs, are necessary to further our understanding of the role of corrective feedback in L2 learning" (p. 127).
As this quote suggests, we need a more holistic understanding of CF before it can be effectively used. This quote further highlights a methodological problem with past studies, which declare that one form of corrective feedback is more effective than another. Such declarations lead to the impression that one type of CF is universally effective. It must be remembered that past experimental studies, which examine one grammatical feature with a select group of learners, may only reflect the efficacy of CF in a specified context. Since the effectiveness of CF may depend on a combination of factors, namely, the type of grammatical feature, L1 influence, and learner proficiency, more holistic study is needed, with comprehensive meta-analysis that allows educators to understand how and when to use CF.

Types of corrective feedback
Like grammatical features, forms of CF are highly diverse. Recasts, clarification requests, elicitation, metalinguistic feedback, and direct feedback represent just a few techniques used to emphasize grammatical features (Lyster & Ranta, 1997; Rezaei, Mozaffari, & Hatef, 2011; Tedick & De Gortari, 1998). Although there are a variety of forms, CF may be separated into two basic categories, input-providing or output-prompting, "based on whether the correct form is provided or withheld" (Li & Vuono, 2019, p. 94). Reformulations, which provide complete corrections of a student error, represent a form of input. Prompts, in contrast, which provide students with an opportunity to correct their own errors, are a means to emphasize output (Lyster & Saito, 2010).
In addition to the input/output dichotomy, an additional distinction may be made based upon implicit and explicit forms of CF (Lyster & Saito, 2010). Reformulations may be either implicit or explicit. Oral recasts, which provide immediate "reformulation of all or part of a student's utterance, minus the error" (Lyster & Ranta, 1997, p. 46), do not offer explicit information about an error. Because the learner must figure out what mistake has been made through listening to a corrected utterance, the technique is primarily implicit. Direct feedback in writing differs significantly from recasts in that it gives learners precise information about what error was made, and how to correct it. As Li and Vuono (2019) point out, this form of feedback, along with any other kind of written feedback, "is always explicit because learners have no trouble recognizing the corrective intention, regardless of how it is provided" (Li & Vuono, 2019, p. 94). Like reformulations, prompts may be either implicit or explicit. Implicit prompts such as clarification requests (teacher asks a student to clarify) and elicitation (teacher asks a student to clarify and waits for a response) do not explicitly point out an error. They merely ask students to reflect and reconstruct language. Metalinguistic prompts provide specific information about a grammatical error (without correcting the error), which makes the technique more explicit.
CF differs based upon input provided, output elicited, or degree of explicitness. These differences may, in turn, have an impact on specific grammatical features. Some forms of CF may be more difficult than others, or they may be more suitable for some grammatical features. Regarding reformulations (recasts and direct feedback), input is provided by the teacher, and may thereby serve as a scaffold. Oral recasts also provide phonological information that may help in the acquisition of features like the regular past, which are systematic and semantically simple, yet phonologically more complex (with phonetic variants of 't', 'd', and 'id'). Phonological attributes may explain why other studies of structured input show significant gains for past tense features (Benati & Angelovska, 2015; Benati & Batziou, 2019). Direct written feedback may also provide a kind of scaffold for more complex features, since the corrections can be carefully processed by a learner over time. Such a scaffolding effect may explain why semantically and morphosyntactically complex features like the past hypothetical conditional appear to benefit from this technique (Shintani, Ellis, & Suzuki, 2014). As for output-inducing prompts, they may hinder communication and place a larger cognitive burden on the learner. At the same time, they may also be more effective when introduced at an appropriate time, such as when a learner has a conscious understanding of the error, but must work to internalize the correction. Such a perspective may explain results of past research, which argues for the superiority of prompts (Lyster & Saito, 2010; Yang & Lyster, 2010).
The explicit attribute of CF, along with characteristics of a learner's L1, may also determine what grammatical features benefit from emphasis. Explicit feedback appears to be most effective when a grammatical feature is negatively impacted by an L1 (Williams, 1995). Past hypothetical conditionals or relative clauses, which are head-initial in English, pose significant challenges for Korean or Japanese language learners who have a head-final L1 (Shin, 2015). Not surprisingly, explicit forms of CF (e.g., direct and metalinguistic feedback) have been shown to improve accuracy of the past hypothetical conditional feature for Japanese university learners (Shintani et al., 2014). Unlike features that differ cross-linguistically, English grammar that is absent from a learner's L1 may be learned more implicitly (Williams, 1995). This phenomenon may explain why explicit CF has not helped Japanese learners acquire the indefinite article, which is absent from both Korean and Japanese L1s (Shintani et al., 2014). Learners may acquire novel grammatical features from implicit input, meaning that explicit CF may be unnecessary or even distracting. As with features absent from the L1, grammar that is highly similar may benefit more from implicit forms of feedback. Such a perspective explains why acquisition of past regular and irregular tenses (similar to Korean both morphologically and syntactically) is enhanced for Korean learners when implicit prompts and recasts are utilized (Cho, 2012; Kim & Cho, 2017).
Review of literature suggests that different forms of corrective feedback should be selected based upon characteristics of a target feature. While this review has yielded some key insights, grammatical features and forms of feedback from past studies are limited in scope. Without large-scale examination of CF, educators cannot determine how and when to use feedback in diverse circumstances. In order to gain the holistic perspective needed for optimization of feedback, various types of CF must be evaluated with a number of different grammatical features; this evaluation must also be considered along with variables like learner proficiency level and L1, which may both impact the difficulty of an L2 grammatical feature. Through comprehensive examination of CF, we can increase our understanding, thereby allowing educators to introduce the most effective techniques based upon grammar and learner characteristics. Software may also be developed that effectively utilizes CF to enhance grammatical accuracy of EFL learners in South Korea and beyond.

Research questions
1. Which kind of corrective feedback (CF) is most effective for promoting accuracy in the production of speech or writing?
2. Does the efficacy of CF differ based upon the target feature? What types of CF are most effective with each grammatical feature?
3. Does the effectiveness of a CF type differ based upon proficiency level of the learner?
4. Does the effectiveness of CF differ based on L1 similarities or differences?

Method
This study was designed to examine the impact of different forms of CF based upon grammatical feature type, L1, and learner proficiency. Since differing L1s may influence the results of communicative tasks, only students with a first language of Korean were selected for study. All learners were Korean university students of English, which helped to ensure that effects due to transfer could be discerned.
To obtain relevant studies using Korean participants, publications from some of the most prominent organizations associated with English education in Korea were systematically searched: the Korean Association of Teachers of English (KATE), Applied Linguistics Association of Korea (ALAK), the Society for Teaching English through Media (STEM), the Korean Association of Foreign Language Education (KAFLE), the Korea English Education Society (KEES), the English Teachers Association of Korea (ETAK), and the Global English Teachers Association of Korea (GETA). Next, Google was systematically searched by using the keyword Korean with various search terms for grammatical features (plural, past tense, past regular, past irregular, passive, third person, questions, article, definite article, indefinite article, phrasal verb, verb particle, conditional) and types of corrective feedback (feedback, recasts, indirect feedback, direct feedback, metalinguistic feedback, prompts).
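The keyword search described above can be sketched as a simple enumeration of queries. The sketch below assumes that each query paired the keyword Korean with one grammatical-feature term and one feedback term; the text does not state exactly how the terms were combined, so the pairing rule and the function name `build_queries` are illustrative assumptions rather than the procedure actually used.

```python
from itertools import product

BASE_KEYWORD = "Korean"

# Search terms for grammatical features, as listed in the text.
FEATURE_TERMS = [
    "plural", "past tense", "past regular", "past irregular", "passive",
    "third person", "questions", "article", "definite article",
    "indefinite article", "phrasal verb", "verb particle", "conditional",
]

# Search terms for types of corrective feedback, as listed in the text.
CF_TERMS = [
    "feedback", "recasts", "indirect feedback", "direct feedback",
    "metalinguistic feedback", "prompts",
]


def build_queries(base=BASE_KEYWORD, features=FEATURE_TERMS, cf_terms=CF_TERMS):
    """Return one query string per (feature term, CF term) pair.

    Assumption: terms were crossed pairwise; the source does not confirm this.
    """
    return [f"{base} {feature} {cf}" for feature, cf in product(features, cf_terms)]


queries = build_queries()
print(len(queries), "queries; first:", queries[0])
```

Enumerating the queries in this way makes the search reproducible, since the full list can be archived alongside the studies that each query returned.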
Since conscious knowledge of a grammatical feature does not necessarily correlate to accuracy in performance, only studies that used learner speech or writing for assessment (rather than multiple choice or grammaticality judgement tests) were selected for study. To help ensure that more implicit knowledge of the target feature was obtained, assessment in experimental studies needed to exhibit the following characteristics: communicate ideas, not rules; put pressure on learners to prevent conscious correction of language errors; focus on meaning, not form; and avoid the use of metalanguage (Ellis, 2009). In order to be included within the present meta-analysis, each experimental study also needed to have: While proficiency level was also studied, inconsistency in reporting results meant that this variable could not always be concretely assessed. Studies which did not explicitly mention proficiency level were still included in this meta-analysis. Such inclusion helped to ensure that data obtained for other variables (grammatical feature type and L1) were more comprehensive. Because duration may also impact the results of a treatment, only studies conducted over one semester were chosen. Although there is some variability in treatment delivery time, Korean university classes (conducted twice a week at one and a half hours each) helped to enforce a level of consistency in duration. Nearly all treatments lasted from two to four sessions, which ranged from two weeks to one month (see Table 2 in Appendix for more information). In total, 115 studies related to the subject area were located. Of these studies, 15 met all the criteria for inclusion in the meta-analysis (Table 2 in Appendix).

Independent variable: corrective feedback
While there are a number of different types of corrective feedback (Lyster & Ranta, 1997; Rezaei et al., 2011; Tedick & De Gortari, 1998), CF may be separated into prompts (output-producing) and reformulations (input-providing). Feedback may then be separated into explicit or implicit, which results in four main categories: implicit prompts, explicit prompts, implicit reformulations, and explicit reformulations. Implicit prompts include clarification requests and elicitation; implicit reformulations include recasts; explicit prompts include metalinguistic feedback and indirect feedback (when an error is merely circled or underlined); and explicit reformulations include direct feedback (Table 1).
Different types of CF included in Table 1 (oral elicitation, recasts, metalinguistic feedback, indirect feedback, and direct feedback) were analyzed using the framework above, which is based upon the conceptualization of CF in Lyster and Saito (2010).
Within studies obtained for meta-analysis, all forms of CF in the implicit prompt and implicit reformulation categories were oral. All direct forms of feedback were in written form. Metalinguistic feedback, however, included both oral and written forms. Since oral feedback may differ in that it provides phonological input or exerts pressure to process errors more quickly, this form of feedback was separated into oral and written metalinguistic groups for analysis.
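The two-by-two framework described in this section (implicit or explicit, prompt or reformulation) can be represented as a small data structure. The following is a minimal sketch; the class name `CFType`, its field names, and the helper `group_by_category` are hypothetical and introduced only for illustration, while the category assignments and oral/written modes follow the descriptions above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CFType:
    name: str
    explicit: bool        # True = explicit, False = implicit
    provides_input: bool  # True = reformulation, False = prompt
    mode: str             # "oral" or "written"

    @property
    def category(self) -> str:
        explicitness = "explicit" if self.explicit else "implicit"
        kind = "reformulation" if self.provides_input else "prompt"
        return f"{explicitness} {kind}"


# Assignments follow the text; metalinguistic feedback is split by mode,
# as it was for analysis.
CF_TYPES = [
    CFType("clarification request", explicit=False, provides_input=False, mode="oral"),
    CFType("elicitation", explicit=False, provides_input=False, mode="oral"),
    CFType("recast", explicit=False, provides_input=True, mode="oral"),
    CFType("metalinguistic feedback (oral)", explicit=True, provides_input=False, mode="oral"),
    CFType("metalinguistic feedback (written)", explicit=True, provides_input=False, mode="written"),
    CFType("indirect feedback", explicit=True, provides_input=False, mode="written"),
    CFType("direct feedback", explicit=True, provides_input=True, mode="written"),
]


def group_by_category(cf_types):
    """Map each of the four categories to the names of its CF types."""
    groups = {}
    for cf in cf_types:
        groups.setdefault(cf.category, []).append(cf.name)
    return groups


for category, names in group_by_category(CF_TYPES).items():
    print(f"{category}: {', '.join(names)}")
```

Keeping explicitness, input provision, and mode as separate fields allows effect sizes to be aggregated along any one of these attributes without recoding the feedback types.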

Independent variable: grammatical features
After studies were selected, the following grammatical features were available for study: past regular, past irregular, past hypothetical conditional, questions, definite articles, indefinite articles, participial adjectives, and verb-argument constructions (e.g., ask me to). In nearly all studies of the English article, with the exception of two (Kim, 2019;Yoo, 2016), definite and indefinite forms were combined in the assessment. To provide the most precise information possible, studies that examined only the indefinite article were separated from other studies that used a combined assessment. Some studies used integrated measures for multiple grammatical features. One study combined regular past and irregular past under a single category, past simple (Kim, 2012); another study combined past simple and present perfect verb tenses (Kim, 2009); and a final study combined participial adjectives with verb-argument constructions (Kim, 2002).

Independent variable: proficiency level
Proficiency levels were separated into six categories: low beginner, beginner, high beginner, low intermediate, intermediate, and high intermediate. These levels were assigned based upon proficiency levels designated within selected studies, which included levels that ranged from high beginner to high intermediate. Five of the 11 studies with proficiency designations reported TOEIC scores, which were distributed in the following ranges: 300 to 400 for high beginner (Kim, 2009), 400 to 500 for low intermediate (Jang, 2016), 500 to 700 for intermediate (Kim, 2009; Kim, 2016; Kim, 2019), and 700 to 900 for high intermediate (Song & Lee, 2017).
Past inconsistencies with assignment of proficiency level must be considered when interpreting results in this meta-analysis. Assessment of proficiency was not standardized within past research and can be expected to vary. Despite such variability, rudimentary categorization by level may reveal some key relationships concerning corrective feedback and proficiency. Furthermore, such analysis can expose key limitations of past research, helping future researchers to design better experimental studies, which address deficiencies in our understanding of when and how CF should be used for different grammatical features. Any relationship concerning proficiency which is established in this meta-analysis will need to be verified through further experimental research. Of the 58 treatment groups within the study, only 41 had information for designating proficiency level.
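For the subset of studies that reported TOEIC scores, the ranges given above can be expressed as a simple lookup. This is only a sketch: the reported ranges share boundary values (e.g., 400 appears in two ranges), so the code assumes that each lower bound is inclusive and each upper bound exclusive, with 900 included in the top band. That boundary rule and the function name `level_from_toeic` are assumptions, not part of the original classification.

```python
# (lower bound, upper bound, label), taken from the TOEIC ranges reported above.
TOEIC_BANDS = [
    (300, 400, "high beginner"),
    (400, 500, "low intermediate"),
    (500, 700, "intermediate"),
    (700, 900, "high intermediate"),
]


def level_from_toeic(score):
    """Return the proficiency label for a TOEIC score, or None if the score
    falls outside the reported ranges.

    Assumption: lower bounds are inclusive and upper bounds exclusive,
    except that a score of exactly 900 is placed in the top band.
    """
    for lower, upper, label in TOEIC_BANDS:
        if lower <= score < upper:
            return label
    if score == TOEIC_BANDS[-1][1]:
        return TOEIC_BANDS[-1][2]
    return None


print(level_from_toeic(450))  # low intermediate
```

Scores outside the reported ranges return `None` rather than a guessed label, which mirrors the treatment groups that lacked information for designating proficiency.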

Independent variable: L1 similarity
L1 similarity describes the degree to which an English feature is similar to the L1. Using a method established by Luk and Shirai (2009), cross-linguistic comparison was conducted by examining the presence/absence of the feature and free/bound/lexical attribute. Since position of morphological or syntactic features (head-initial or head-final) may impact acquisition (Shin, 2015), this attribute was added to the assessment of morphosyntactic similarity or difference. Together, three categories were created: Absent, Present (present in L1 with some difference in the free/bound/lexical attribute or position in relation to the head), and Very Similar (present in L1 with the same free/bound/lexical attribute and the same head-initial or head-final position). Among the grammatical features available for study, the English article represented the Absent category (it is not present in the Korean L1).
Hypothetical conditional and question features represented the Present category, since they exist in the L1, yet have a difference in word order or use of morphology. The Korean form of the past hypothetical conditional has differences in both morphology and syntax. The Korean word meaning if (− ), for example, is a bound (rather than free) morpheme appearing head-final, which contrasts with the head-initial position in English. Korean questions differ from their L2 counterpart in that auxiliary verb and subject are not inverted.
Past regular and irregular tenses represented the Very Similar category, since these features are present in both languages, match the free/bound/lexical attribute, and appear in the same position in relation to the head (verb). Like the past regular tense in English, the Korean word for did (haett-da) is used as a suffix for many verbs to denote past. Some past verbs, however, are lexically transformed in novel ways, as in the Korean word for stay (meo-moo-ru-da), which is changed into meo-mool-leott-da to add the past meaning.
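The three-way classification described in this section amounts to a short decision rule. The sketch below assumes that each feature can be summarized by two attributes, its free/bound/lexical status and its position relative to the head; the names `FeatureProfile` and `classify_l1_similarity` are hypothetical, and the attribute values in the examples are simplified encodings of the descriptions above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FeatureProfile:
    morpheme_type: str  # "free", "bound", or "lexical"
    head_position: str  # "initial" or "final"


def classify_l1_similarity(l2: FeatureProfile, l1: Optional[FeatureProfile]) -> str:
    """Classify an L2 feature as Absent, Present, or Very Similar to the L1.

    Pass l1=None when the feature does not exist in the L1.
    """
    if l1 is None:
        return "Absent"
    same_type = l1.morpheme_type == l2.morpheme_type
    same_position = l1.head_position == l2.head_position
    return "Very Similar" if same_type and same_position else "Present"


# English article: no Korean counterpart.
print(classify_l1_similarity(FeatureProfile("free", "initial"), None))

# Conditional "if": free and head-initial in English, bound and head-final in Korean.
print(classify_l1_similarity(FeatureProfile("free", "initial"),
                             FeatureProfile("bound", "final")))

# Regular past: a bound suffix in both languages. The position label is an
# illustrative encoding; only its equality across languages matters to the rule.
print(classify_l1_similarity(FeatureProfile("bound", "final"),
                             FeatureProfile("bound", "final")))
```

Under this rule, a difference in either attribute is enough to move a feature from Very Similar to Present, which is how the text treats the hypothetical conditional and question features.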

Dependent variable: effect size
Effect size was calculated by inserting pretest scores (M2), posttest scores (M1), and associated standard deviations (SD2 and SD1) into the Cohen's d formula for effect size (Spada & Tomita, 2010, p. 307). Pretest scores were compared with posttest scores as a means to evaluate how much of an impact CF had on participants. Resulting effect sizes for each type of feedback (direct feedback, indirect feedback, metalinguistic feedback, oral recasts, and elicitation) were compiled into charts based upon grammatical features, L1 similarity, and proficiency level. Types of corrective feedback were analyzed in accordance with input-providing, output-prompting, explicit, and implicit attributes of technique.
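The pretest-posttest calculation can be illustrated with a short function. The formula itself is not reproduced in this text, so the sketch assumes one common form of Cohen's d that matches the symbols named above: d = (M1 − M2) / sqrt((SD1² + SD2²) / 2). The formula on the cited page may differ (for example, by weighting the standard deviations by sample size), and the scores in the usage line are invented for illustration only.

```python
import math


def cohens_d(m_post, m_pre, sd_post, sd_pre):
    """Cohen's d for a pretest-posttest comparison.

    Assumption: the pooled SD is the root mean square of the two SDs,
    i.e. sqrt((SD1^2 + SD2^2) / 2). The cited formula may differ.
    """
    pooled_sd = math.sqrt((sd_post ** 2 + sd_pre ** 2) / 2)
    if pooled_sd == 0:
        raise ValueError("Pooled standard deviation is zero; d is undefined.")
    return (m_post - m_pre) / pooled_sd


# Hypothetical scores: pretest M2 = 10 (SD2 = 2), posttest M1 = 12 (SD1 = 2).
print(cohens_d(m_post=12, m_pre=10, sd_post=2, sd_pre=2))
```

A positive value indicates a gain from pretest to posttest, while a negative value indicates that accuracy was lower after the treatment.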

Results and discussion
Explicit techniques for written CF tended to be the most effective. Written metalinguistic feedback had the highest effect size (d = 1.99), while direct written feedback had the second highest effect size (d = 1.54). In contrast to these forms of explicit CF, other forms of explicit feedback such as indirect written feedback (d = .88) and oral metalinguistic feedback (d = .89) had relatively low effect sizes. Of the explicit techniques, indirect written feedback and oral metalinguistic feedback provide the least amount of scaffolding in information provided or time given for a response, respectively, which may explain why these techniques were not as effective in promoting accuracy. Of the implicit forms of CF, recasts had the highest effect size (d = 1.08), while elicitation (d = .36) had a much lower effect size. Although past research suggests that prompts are superior to recasts (Lyster & Saito, 2010; Yang & Lyster, 2010), the opposite result has been found for the Korean EFL learners included in this study. Overall, lower values for most prompts, with the exception of written metalinguistic feedback, suggest that this kind of CF is less effective.
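The overall comparison reported in this paragraph can be summarized by ranking the mean effect sizes. The values below are those reported above; the dictionary layout and the function name `rank_by_effect` are illustrative choices only.

```python
# Mean effect sizes (Cohen's d) reported above, with each technique's explicitness.
MEAN_EFFECT_SIZES = {
    "written metalinguistic": (1.99, "explicit"),
    "direct written": (1.54, "explicit"),
    "oral recast": (1.08, "implicit"),
    "oral metalinguistic": (0.89, "explicit"),
    "indirect written": (0.88, "explicit"),
    "oral elicitation": (0.36, "implicit"),
}


def rank_by_effect(effects):
    """Return technique names ordered from largest to smallest effect size."""
    return sorted(effects, key=lambda name: effects[name][0], reverse=True)


ranking = rank_by_effect(MEAN_EFFECT_SIZES)
for name in ranking:
    d, explicitness = MEAN_EFFECT_SIZES[name]
    print(f"{name:<24} d = {d:.2f} ({explicitness})")
```

Listing explicitness alongside each value shows that the ordering does not follow the explicit/implicit distinction alone, since recasts rank above two of the explicit techniques.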
While average effect sizes for CF seem to provide some key insights, these effect sizes appear to vary considerably based upon what grammatical feature was emphasized (Fig. 1). Analysis of the simple past tense, for example, suggests that implicit recasts are much more effective than explicit forms of CF such as direct written, indirect written, and metalinguistic oral feedback. Direct (d = −.31) and indirect (d = −.59) written forms of feedback both had a negative impact on accuracy in production, suggesting that this style of CF may actually lower a learner's ability to accurately produce the target feature.
Findings may be explained through understanding the past regular and irregular tenses and their similarity to the Korean language. In the Korean L1, as in English, there is a regular past suffix and irregular past verbs. Due to the cross-linguistic similarities, implicit forms of feedback may be sufficient as a means to emphasize the feature, without adding unnecessary cognitive burden which may result from explicit forms of feedback.
Like forms of explicit CF, implicit oral prompts were less effective for the past tense. For the regular past tense, prompts were approximately one half as effective as the recast. For the irregular past tense, prompts were about one third as effective (Table 3 in Appendix). Lower effect sizes for prompts may reflect an influence from the Korean L1. Past regular and irregular tenses are morphologically and semantically similar in both languages, making phonological attributes the most challenging obstacle for acquisition. Whereas oral recasts provide phonological input suitable for learning how to pronounce the feature, implicit prompts do not. Taken as a whole, L1 and L2 similarities in morphology, along with phonological complexity (with regular past phonetic variants of 't', 'd', and 'id' and many past irregular forms), appear to explain why recasts are more effective. Neither oral prompts nor explicit forms of written feedback provide phonological information through CF.
Unlike the past tense, explicit forms of CF reveal larger effect sizes for semantically and morphosyntactically complex features like the past hypothetical conditional and article. For the conditional feature, direct feedback (d = 2.76) was slightly more effective than metalinguistic feedback (d = 2.67). Indirect written feedback had a smaller effect size (d = 1.58), yet this was the largest value for this feedback type across the grammatical features examined. Characteristics of the past hypothetical conditional may explain why explicit CF is so effective. The feature is present in the L1, yet it differs both morphologically and syntactically from its English counterpart. The Korean conditional uses a bound morpheme appearing in a head-final position, which differs from the free morpheme that is head-initial in English. Explicit forms of feedback can help a learner identify morphological or syntactic aspects of the L1 that are not allowed in English. Through written feedback, learners may also have the time they need to cognitively process features that are semantically similar (exist in L1), yet differ in morphosyntax.
For English articles, metalinguistic CF in both written (d = 2.09) and oral forms (d = 1.78) was very effective. Direct written feedback (d = 1.91) had a lower effect size than written metalinguistic feedback. Because one of the most complex aspects of article usage involves semantic and pragmatic understanding of how and when to use the feature, larger values for metalinguistic feedback may suggest that this CF technique is ideal for targeting semantic or pragmatic differences in morphosyntax. Higher effect sizes for metalinguistic written feedback for the indefinite article (d = 1.45) over direct written feedback (d = 1.07) appear to further confirm this assertion. With metalinguistic feedback, learners are not "fed" the answer. Instead, they must incorporate new information with existing knowledge to correct the form. Such a cognitively intensive form of corrective feedback may be ideal for getting learners to identify complex semantic or pragmatic relationships which may take more intensive thought to acquire.
As with the past hypothetical conditional and English article, combined past and present perfect tenses reveal a large effect size for explicit written CF. Direct and indirect written feedback had values of d = 1.77 and d = 1.28, respectively. Explicit forms of feedback may be more effective when grammatical complexity increases. This view may explain why direct written feedback had the lowest effect size for the past simple tense (d = −.31) and the highest for the past hypothetical conditional (d = 2.76). On the whole, results suggest that explicit CF should be carefully chosen based upon characteristics of a target feature. Interestingly, oral recasts appear to have a more uniform impact. They remain relatively constant for different grammatical features, yielding effect sizes in a narrow range from d = .67 to d = 1.27. These recasts are also consistently more effective than elicitation of the past regular or past irregular tenses.
While information about proficiency was limited, some interesting relationships between level and grammatical feature emerged (Table 4 in Appendix). Overall, effect sizes based upon proficiency appeared to be variable, and did not show a consistent trend in efficacy or inefficacy of CF. When proficiency was considered along with the target feature, however, key patterns emerged. Learners at the high beginner level appeared to benefit from grammar emphasis of past tenses. All of the experimental groups for written direct (d = 1.97) and written indirect feedback (d = 1.28) at this level emphasized the past regular, past irregular, and present perfect tenses, yielding strong effect sizes. Oral metalinguistic feedback, recasts, and prompts, which included groups that exclusively targeted the past regular and irregular tenses, yielded moderate effect sizes of d = .24, .89, and .36, respectively. Results suggest that CF for the past tense is useful at the high beginner level and that oral feedback may provide less of a scaffold, explaining lower effect sizes for this type of feedback.
At the low intermediate level, emphasis of simple past regular and irregular tenses yielded negative effect sizes for written CF. All indirect written feedback groups at this level emphasized regular and irregular simple past tenses, resulting in a negative impact (d = −.59). Direct feedback, which had an overall effect size of d = .48, included two experimental groups that emphasized past regular tenses (d = −.32) and two that emphasized anaphoric "the" (d = 1.27). Results suggest that emphasizing the past regular tense at an intermediate level may not have an optimal impact on accuracy in production. In contrast, CF with more semantically or syntactically complex features appears to have a higher impact. Written metalinguistic feedback, which had six experimental groups all emphasizing anaphoric "the", had a high effect size (d = 1.17). Recasts at this level were represented by only one group that emphasized both participial adjectives (e.g., "boring") and formulaic constructions like verb + pronoun + to (e.g., "I want her to visit my place"), yielding a large effect size (d = 1.27).
At the intermediate level, emphasis of the past hypothetical conditional and English article appears to be very effective. Direct, indirect, and metalinguistic written feedback all had experimental groups that targeted these features, yielding high effect sizes of d = 2.44, 1.43, and 2.93, respectively. Korean learners at this level also revealed a relatively large benefit from recasts (d = 1.18) and oral metalinguistic feedback (d = 1.78) that targeted the English article. As at the beginner level, oral forms of CF appear to have slightly lower effect sizes when the same grammatical feature is emphasized, which may reveal that written forms of feedback serve as a better scaffold. Results suggest that CF used with semantically and syntactically complex features at the intermediate level is effective. Thus, timely introduction of grammatical features may increase accuracy in speech or writing.
Essentially, differences in effect size by proficiency appear to be impacted by the characteristics of a target feature. Semantically and syntactically simpler past tenses appear to benefit from CF at a high beginner level, but not at a low intermediate level. Perhaps learners at higher proficiency levels have passed an opportune time for emphasis, lending support to findings within past research, which assert that grammar is teachable only when introduced at a specific stage of learner development (Dyson, 2018; Dyson & Håkansson, 2017; Pienemann, 1989). Smaller effect sizes for indirect written feedback of the English article at the high intermediate level (d = .38) may also suggest that a key time in learner development has passed for this feature, making pedagogical intervention ineffective. Overall, forms of CF appear to be more effective when a learner is ready to cognitively handle and acquire a feature. As revealed in this meta-analysis, however, very little is known about what grammatical features can benefit from CF at each stage of development. Only some levels of proficiency have been examined within past studies. For Korean learners, CF was extensively investigated at the intermediate level, yet neglected at early beginner and advanced levels. Furthermore, assessment of proficiency has not been systematic among researchers, limiting generalizability of results. More experimental research is needed to confirm or refute any relationships revealed concerning proficiency.
In addition to analysis by proficiency level, review of L1 similarity revealed key insights concerning different types of feedback (Table 5 in Appendix). For explicit CF, effect size was the largest when used with grammatical features that are both present in the L1 and morphologically and syntactically different from the L2 (direct feedback, d = 2.76; indirect feedback, d = 1.58; written metalinguistic feedback, d = 2.67). Explicit CF was effective, yet less effective, when used with grammatical features that are absent in the Korean L1 (direct feedback, d = 1.63; indirect feedback, d = 1.17; written metalinguistic feedback, d = 1.94). The findings appear to confirm past research by Williams (1995), which suggests that explicit instruction is most necessary when grammatical features are present in the L1, yet have a key difference that cannot be implicitly learned from L2 input. Like explicit forms, implicit forms of CF appear to be more effective with L1 Present features. For recasts, these features had an effect size of d = 1.27, whereas Absent features had a value of d = 1.21. As with explicit techniques for CF, Very Similar features have the lowest effect size of d = .89. Overall, findings suggest that L1 Present features which have some kind of L1/L2 disparity benefit most from CF. As previously discussed, effects for oral recasts remain relatively stable despite L1 differences or similarities. Explicit forms of feedback, however, reveal larger differences in effect size between categories, which suggests that choice of explicit CF should be limited to grammatical features that are either present, yet different from the L1, or absent from the L1.
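The contrast between stable recast effects and more variable explicit effects can be illustrated with simple arithmetic on the values quoted above. This is a rough sketch, not the analysis used in the study: it compares only the two L1 categories for which this paragraph reports values for every technique (present but different, and absent), and the category keys and the `spread` helper are names assumed here for illustration.

```python
# Effect sizes (d) by L1 similarity category, as quoted in the paragraph above.
# Category keys and the helper below are hypothetical names for illustration.
by_l1 = {
    "direct written": {"present_but_different": 2.76, "absent": 1.63},
    "indirect written": {"present_but_different": 1.58, "absent": 1.17},
    "written metalinguistic": {"present_but_different": 2.67, "absent": 1.94},
    "recast": {"present_but_different": 1.27, "absent": 1.21},
}


def spread(values):
    """Difference between the largest and smallest effect size, to 2 decimals."""
    values = list(values)
    return round(max(values) - min(values), 2)


spreads = {technique: spread(cats.values()) for technique, cats in by_l1.items()}
for technique, gap in spreads.items():
    print(f"{technique:<24} difference between categories = {gap:.2f}")
```

On these figures, the gap between the two categories is far smaller for recasts than for any explicit written technique, which is consistent with the paragraph's claim that recast effects are comparatively stable across L1 categories.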

Conclusion
Results suggest that, although explicit forms of feedback (direct and metalinguistic CF) tend to be more effective, this efficacy is highly dependent upon the grammatical feature that is targeted, as well as both L1 and proficiency level of the learner. Direct, indirect, and metalinguistic feedback were not very effective with the past simple tenses, which are semantically and morphologically similar to the Korean L1. Because of the simplicity and similarity of the English past tense to Korean, implicit techniques like recasts may provide enough focus, while not cognitively burdening the learner. In addition, recasts provide phonological input, which may be most necessary for learning L1 similar features. In contrast to past simple tenses, metalinguistic feedback (both written and oral) was very effective for the English article, suggesting that this form of CF may promote learning of complex semantic and pragmatic concepts associated with grammar. Collectively, results suggest that explicit types of CF should be carefully chosen based upon characteristics of a grammatical feature.
Analysis of proficiency level suggests that timely introduction of grammatical features is a main determinant of the efficacy or inefficacy of CF. As learners increase in proficiency, they appear to benefit more from a focus on grammatical features that grow in scope or complexity. Proficiency designations in this study have clear limitations, but they do demonstrate how little is known about when specific grammatical features should be introduced to maximize acquisition, thereby illustrating a need for further research. Without such knowledge, the true efficacy or inefficacy of various forms of CF cannot be accurately assessed. As for L1 influence, a specific form of English grammar with a Korean equivalent that is somewhat different may benefit most from explicit instruction. This view appears to align with past research, which suggests that L1 differences in usage of an English grammatical feature may need explicit emphasis before a learner can recognize the problem (Williams, 1995).
While results of the study provide useful information for educators, there is a need for further experimental research. The current corpus of experimental studies has only a limited scope. For this meta-analysis, available studies were limited, and only concentrated on the same grammatical features (past simple tenses, English articles, and past hypothetical conditionals), which restricted understanding of how other features are influenced by CF. Since each grammatical form has its own distinct properties (phonological, semantic, and syntactic), more holistic experimental study is needed. Among the grammatical features studied, very few were examined in the context of both L1 and learner proficiency. Furthermore, very few grammatical features were studied using all forms of corrective feedback (implicit reformulations, explicit reformulations, implicit prompts, and explicit prompts).
There is a clear need for more meta-analysis and experimental research that comprehensively investigates multiple influences of CF (L1, proficiency, and grammatical feature type). By expanding study to include other languages and learners on a global scale, how to provide the right form of CF at the right time may finally become well defined. Essentially, more comprehensive study is needed before CF may be optimized for the learner. To provide the holistic understanding required for educators, a corpus of small-scale experimental studies may be needed. Through using standardized measures of L1 similarity and proficiency levels, various small-scale studies of different grammatical features may be collated from across the globe, providing the holistic perspective that is needed. In any case, a need for more comprehensive understanding of how and when to use CF is readily apparent. Through further research, better methods to utilize feedback may be developed for both teachers and computer-based programs, which effectively tailor instruction to the needs of the learner.