The Role of Opacity, Type of Instruction, and Competence Level

As a sub-type of Chinese idioms, chengyu 成语 play an important role in Chinese teaching. This study tested the effectiveness of etymological elaboration in relation to the type of instruction (direct or indirect), the degree of transparency of the chengyu, and the participants’ proficiency level. A total of 48 post-elementary and 53 pre-intermediate learners were divided into three groups, one control group and two experimental groups. The experimental groups attended a 30-minute treatment, and a post-test was then administered to all the participants. Statistical and qualitative analysis revealed that indirect instruction was significantly more effective than direct instruction, independently of the participants’ competence level, and that opaque chengyu were more easily retained than transparent chengyu.


Introduction
Idioms are defined as multiword expressions consisting minimally of two words, including compound words, non-literal or semi-literal in meaning, and generally rigid in structure (Liu 2008, 23). Under the influence of generative grammar, figurative expressions were long considered a marginal component of language and, until recently, idiom teaching was largely neglected in foreign language teaching (FLT). However, a large body of research has provided evidence that building up an adequate repertoire of idioms is a necessity for foreign language (FL) learners of any competence level, especially for improving their communicative and socio-pragmatic competence (Liu 2008, 103-4).
Chengyu 成语 are the most renowned type of Chinese idiomatic expressions and play an important role in Chinese as a foreign language (CFL). In addition to their cultural relevance, many scholars (e.g. Guo 2017; Huang 2011; Shi H. 2007; Shi J. 2008) underlined that chengyu are very effective in communication and can help learners convey messages more successfully. Nevertheless, the importance of chengyu learning in CFL was overlooked until recently, and although the number of studies in this field is constantly increasing, pedagogically-oriented studies on chengyu teaching have been primarily descriptive, whereas experimental research is still at an early stage.
This study is an attempt to fill this gap by integrating Chinese scholars' proposals on chengyu teaching with the results obtained in experimental intervention studies on FL learners' acquisition of idioms. More precisely, it tests the effectiveness of etymological elaboration, a teaching technique based on the reactivation of the literal meaning of idioms, combined with the character-centred approach to Chinese vocabulary teaching. The ability tested is the active recall of the meaning of the target chengyu, and the effectiveness of the considered technique is measured in relation to the following variables: type of instruction (direct or indirect), degree of transparency of the target chengyu, and the participants' proficiency level.

Literature Review

Defining Chengyu
The issue of the classification of Chinese idiomatic expressions is still debated, and to date there is no definitive agreement among scholars on how to define chengyu (see Conti 2019).
Based on Rosch's Prototype theory and Lakoff's Idealized cognitive models, Hu (2015) proposed a definition of the prototypical chengyu. According to the scholar, prototypical chengyu are conventional, lexicalized expressions characterized as follows:
• Invariable four-character structure with a bipartite prosody (AABB);
• Unitary, concise meaning with different degrees of semantic opacity;
• Literary origins;
• Formal register;
• Lexemic function,¹ i.e. they are syntactically identifiable as a form class in the lexotactics.
From the semantic point of view, several scholars (e.g. Ni, Yao 1990) identified three types of relationship between the literal and idiomatic meaning of chengyu, corresponding to three degrees of transparency.² Transparent chengyu are completely compositional, that is, their meaning is obtained by summing up the literal meaning of each component. Semi-transparent chengyu are those whose idiomatic meaning is an extension of the literal meaning, that is, their idiomatic meaning can be metaphorically inferred. Lastly, opaque chengyu are those which display little or no relationship between their literal and metaphorical meaning. From the morphosyntactic point of view, the types of relations between the composing characters of the chengyu are varied, ranging from the simple juxtaposition of synonyms or antonyms to more complex structures, such as subject-predicate, predicate-object, predicate-complement, modifier-head, etc. (An 2016).
Lastly, the syntactic behavior of chengyu resembles that of single words or phrases, that is, they can function as any other syntactic component. Ni and Yao (1990) observed a correspondence between the morphosyntactic structure of chengyu and the syntactic role they can fulfil. Even though the majority of chengyu display one or few preferred syntactic functions (Xia 2009), it can be maintained that nominal chengyu mostly consist of modifier-noun sequences and can occur as subject or direct object, whereas predicative chengyu can be used as predicate, noun or verb modifier, verb complement, etc. As observed by Hu (2015), some chengyu are also used independently, functioning as linking words and, following Fernando's (1996) terminology, fulfilling a relational function.
1 The terminology adopted here is that of Makkai (1978).
2 The terminology used in this paragraph is that adopted by Moon (1998).

Difficulties in Chengyu Learning
The acquisition of idioms is no easy task for learners. Since they are figurative expressions that do not generally mean what they literally state, one of the major challenges for learners is comprehension. Liu (2008) classified the factors that affect FL learners' comprehension of idioms into two categories, namely, factors relating to idioms and their use and factors relating to language users. The first category includes frequency of use and degree of semantic and syntactic transparency, and the second includes age, cognitive style, L1, and competence level. The difficulty is further increased by the poor quality of exposure to idioms, which appears to mainly occur in non-interactive situations such as movies or television (Irujo 1986a, 1993). Similar difficulties were also reported by the participants in the survey on chengyu learning and teaching conducted by Liu (2013). A prolific line of research examined the errors made by CFL learners in chengyu use (e.g. Guo 2011; Hong 2003; Li 2011; Shi J. 2008; Shi L. 2008; Zhang 1999; Zhang 2006), finding that most of the errors concerned form and meaning. Other errors involved syntactic and pragmatic use, particularly register. As noticed by many (e.g. Hong 2003; Shi J. 2008; Shi L. 2008), one of the main causes of syntactic errors is that learners fail to recognize the morphosyntactic structure of the chengyu.
Other causes of errors include cultural specificity and interlanguage interference, as well as the current teaching practices and materials. Cui (2008) and Huang (2011), for instance, observed that chengyu are underrepresented in the most common teaching syllabi and that the selection itself is not based on principled criteria. This led to a general lack of guidelines for practitioners and textbook editors, corresponding to a high degree of freedom both in the selection of chengyu for textbooks and in language teaching practices (Hong 2012; Lao 2009). Some scholars proposed difficulty scales for the graded selection of chengyu for teaching. Based on semantic criteria, Pan (2006) suggested presenting chengyu by increasing opacity, on the assumption that transparent chengyu are easier to learn than semi-transparent and opaque chengyu. Other scales such as Zhang's (2012) are more comprehensive, even though relevant factors such as frequency, syntactic analyzability, lexical complexity, or register are rarely considered.
As for teaching practices, Zhou and Wang (2009) observed that chengyu teaching is often limited to a brief explanation of the meaning of the expression or a translation into English, which is likely to lead to the occurrence of negative transfer. Such observations were confirmed in a recent study by Guo (2017), who interviewed 34 CFL practitioners and found that most of them did not consider chengyu

The Character-Centred Approach
In recent years, an increasing number of studies have offered suggestions aimed at overcoming learners' difficulties and providing effective instruction in chengyu teaching (e.g. Jiang 2012; Pan 2006; Xia 2010; Zhang 2006; Zhou, Wang 2009). However, as already mentioned, the techniques suggested in these studies are rarely supported by empirical evidence.
One of the few examples of quantitative studies on chengyu teaching is that conducted by Huang (2017), who investigated the effectiveness of deep-rooted cultural input (DrCI) on the ability of beginner, intermediate, and advanced learners to infer the meaning of unknown chengyu. After a six-week treatment based on either conventional cultural input (CCI) or DrCI and four post-tests, Huang observed that overall DrCI was statistically more effective than CCI, and also assisted learners' retention of cultural information. The study also evidenced a significant impact of DrCI for learners at different proficiency levels, surpassing lack of linguistic knowledge and competence.
Many scholars recommend adopting a character-centred approach in chengyu teaching (e.g. Jiang 2012; Yang 2015; Zhang 2013). As reported by Wang (2000), the character-centred approach (zibenwei 字本位) was first introduced by Bellassen in his Méthode d'Initiation à la Langue et à l'Écriture Chinoises, as opposed to the word-centred approach (cibenwei 词本位) for the description and teaching of Chinese vocabulary. Given the word-building properties of Chinese characters, the character-centred approach is considered as particularly beneficial in CFL, as it can help learners broaden their vocabulary size quickly and autonomously (Jia 2001). An empirical study by Zhang et al. (2019) recently confirmed these assumptions, demonstrating that students can spontaneously develop word knowledge through exposure to print materials as long as they have sufficient morphological discrimination knowledge and metalinguistic awareness; therefore, the authors concluded that learners should be explicitly taught to use word-internal information to derive word meanings.
Based on these assumptions, scholars such as Yang (1996) and Zhou and Wang (2009)

[i]t would be a deficit-oriented approach to learn Chengyu as unanalysed chunks, without understanding their constituent parts and grammatical functions. (Guo 2017, 84)
This is because, given the deep relationship between the literal and idiomatic meaning and between the morphosyntactic structure and the syntactic use, the analysis based on the character-centred approach is the key for fully mastering chengyu (Zhou, Wang 2009). In addition, given the high productivity of numerous morphosyntactic structures, the character-centred approach is also useful to increase the analyzability and predictability of chengyu meaning and to enhance learners' autonomy in chengyu learning (Xiang 2013). These assumptions, though not empirically demonstrated, are consistent with Wray's (2000, 2002) hypothesis on formulaic language learning. According to the scholar, learners' and native speakers' mental lexicons have different compositions, in that the former find it difficult to avoid analytic word-by-word processing of the FL/L2 and tend to acquire single-word units instead of ready-made multiword strings. In order to effectively teach formulaic sequences (including idioms), teachers should satisfy learners' need for analysis and look for a way of accommodating analyticity and formulaicity: the character-centred approach in chengyu teaching is apparently fit for this purpose.

Etymological Elaboration
The meaning of idioms is motivated, that is, its derivation from the literal meaning can be metaphorically explained (Gibbs, Nayak 1991). The research on cognitive metaphors (e.g. Gibbs 1994; Kövecses, Szabó 1996; Lakoff 1993) offered the possibility of presenting idioms in ways that promote insightful learning rather than blind memorization and inspired numerous proposals for teaching techniques that require learners' cognitive engagement with the target forms. Some examples include appraising the phonological or graphemic shape of words and phrases, making cross-cultural comparisons, grouping idioms on metaphorical bases, and more.³ A group of studies explored the mnemonic potential of the imageability of figurative idioms, i.e. idioms that call up mental pictures. The studies conducted by Boers (2001), Boers, Demecheleer, Eyckmans (2004), and Boers, Eyckmans, Stengers (2007)  this stimulating learners' rich processing and engagement with the lexical items. In other terms, this technique invites learners to use imagery by asking them to hypothesize about the etymological origin of idioms. In a first study, Boers (2001) tested etymological elaboration on 54 EFL learners divided into two groups. Both groups were given a handout with ten idioms and the task of explaining their meaning. The participants were also given an extra task: the control group had to supply a possible context in which the idiom could be used, whereas the experimental group had to supply a possible origin of the idiom. Two follow-up tasks measured the participants' retention of the form and the meaning of the ten idioms, respectively. The difference between the groups was statistically significant, with the experimental group outperforming the control group in both tasks.
In the 2004 and 2007 studies, the scholars implemented etymological elaboration in computer software comprising three types of exercises: a multiple-choice task, asking participants to select the most plausible origin of the idiom; a comprehension task, asking them to select the correct meaning of the idiom; and a fill-in-the-gap task, which stimulated meaning recall. In both studies, the tasks were administered to the participants in different orders. After analysing the results, the authors observed that tackling the identify-the-source task prior to the identify-the-meaning task generally led to better recall. In other terms:
students who had been given the opportunity to use etymological information to try and figure out the idiomatic meaning of the expressions seemed more likely to remember the expressions than students who had perhaps resorted to blind guessing. (Boers, Eyckmans, Stengers 2007, 53)
Therefore, etymological elaboration proved particularly effective when an inductive teaching approach based on problem-solving activities was adopted.

Research Questions
Based on what was discussed in the previous sections, etymological elaboration seems particularly suitable for chengyu teaching for at least two reasons: 1) the origin of chengyu is well documented and their etymology can be tracked down in most cases; 2) the literal interpretation of the idiom meaning satisfies the assumptions of the character-centred approach to chengyu teaching. Therefore, the aim of this study is to apply etymological elaboration to chengyu teaching and test its effectiveness. The analysis of the literal meaning of the chengyu will be carried out adopting the character-centred approach. The research questions for this study are the following:
1. Is etymological elaboration combined with the character-centred approach effective for the comprehension and retention of the target chengyu (TC)?
2. Is there any difference between the direct/deductive approach and the indirect/inductive approach?
3. Does semantic transparency affect the participants' comprehension and retention of the target chengyu?
4. Do results depend on the participants' competence level?
The study adopts a QUAN + qual post-test only quasi-experimental design, with post-test score as the dependent variable and type of exposure to the input, type of instruction, TC transparency, and participants' competence level as the independent variables.

Participants
The participants in the experiment were 101 Italian second- and third-year bachelor's students of Chinese from two different institutions, the University of Naples "L'Orientale" and the University of International Studies of Rome (UNINT). In each course year, intact classes were randomly assigned to one of the three conditions (Table 1). In order to measure the proficiency level of the participants, the reading part of the Chinese Proficiency Test (HSK) levels 2 and 3 (elementary and pre-intermediate) was administered to a randomly selected sub-sample (88 participants in total) from both institutions. It was then assumed that the measured proficiency levels were representative of the average level of the entire sample. Year 2 satisfactorily completed HSK 2 but did not obtain sufficient scores in HSK 3, whereas year 3 satisfactorily passed both tests. It can thus be concluded that at the time of the test, the participants in the two years

TC Selection
In order to guarantee learnability and minimize between-item variability, it was established that the selected TCs must satisfy the fol- The final selection consisted in the 6 TCs shown in Table 2.

Material and Treatment
For the teaching session, a short dialogue composed of two or three turns was created for each TC. The vocabulary and the grammar of the dialogues were kept within HSK level 2. The turns containing the TCs were retrieved from entries of the CCL Corpus, partially adapted to match the competence level of the participants. The turns not containing the TCs were created ad hoc. A native speaker of Chinese reviewed all the dialogues. The dialogues were then inserted into two PowerPoint slide-show presentations, one for each teaching approach. The contents of the two presentations were the same, and consisted of three parts: 1) a general description of prototypical chengyu, including information on their form, origins, and the various relationships between morphosyntactic structure and word class, and between literal and idiomatic meaning; 2) the dialogues with the TCs in bold; 3) six slides presenting each TC in isolation and providing information on their form, meaning, and syntactic use.
The presentations for the two types of instruction only differed in the order of the contents, following a top-down progression for the direct approach and a bottom-up progression for the indirect approach. As the direct approach proceeds from the general rule to the exemplification of the specific cases (Ellis 2005), in the presentation for the direct group (DG) the contents were ordered as follows: general description of chengyu > description of the TCs in isolation > exemplification through dialogues.
On the contrary, in the indirect or discovery approach, learners:
are provided with L2 data that illustrates the form and are asked to work out how the form works for themselves. (Ellis 2005, 717)
Therefore, the presentation for the indirect group (IG) followed the inverse order, going from the data to the general rule. Thus, the order of the contents was the following: dialogues > description of the TCs in isolation > general discussion on the characteristics of chengyu. The teaching sessions for the treatment were conducted during regular class hours by the author of the study. The duration of each session was approximately 30 minutes. In the teaching session for the DG, the contents of the PowerPoint were entirely presented by the researcher without interaction with the participants. Special stress was laid on the metaphorically motivated relationship between the literal and the idiomatic meanings, and between the morphosyntactic structure and the word class.
In the teaching session for the IG, the participants were randomly divided into small groups. After each dialogue was shown, they were given one minute for discussion. They were asked to infer the meaning and word class of each TC based on word-internal (morphemes) and word-external elements (syntactic and contextual cues). The groups were also asked to provide a possible motivation for the relationship between the literal and idiomatic meaning. The slide describing the TC in isolation was then shown to the groups in order to provide feedback, either confirming or correcting their inferences. The teaching session ended with the discussion on the general characteristics of chengyu.

Post-Test
The post-test was administered one week after the treatment. The duration was approximately 20 minutes. The TCs in the test were randomly distributed (see Table 2). The test consisted of six items corresponding to the six TCs. Each item comprised four tasks: two translation tasks (t1 and t3) and two multiple-choice tasks (t2 and t4). In t1, the participants had to provide a literal translation for each character composing the TC. In t2, they had to indicate if the TC was transparent, semi-transparent or opaque. In t3, they had to provide an explanation of the figurative meaning or a rephrasing of the literal meaning in the case of transparent TCs. Lastly, in t4, they had to indicate the word class of the TC and to provide a motivation for their choice. In t1, each correctly translated character was assigned 1 point. The translations in t3 were rated 0 points for missing or totally inaccurate translations, 1 point for partially accurate translations, and 2 points for accurate translations. The multiple-choice items (t2 and t4) were assigned 1 point for correct answers and 0 points for missing or wrong answers. No score was assigned to the motivations in t4.
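The scoring scheme described above can be illustrated with a short sketch. The Python code below is only an illustration and not the raters' actual procedure: the names `ItemResponse`, `score_item`, and `score_test` are hypothetical, and the limit of four characters per TC is an assumption based on the four-character structure of prototypical chengyu.

```python
from dataclasses import dataclass

# Assumption: every target chengyu (TC) consists of four characters.
MAX_CHARACTERS = 4


@dataclass
class ItemResponse:
    """Rated answers of one participant for one TC (hypothetical structure)."""
    t1_correct_characters: int  # number of correctly translated characters
    t2_correct: bool            # transparency judgement (multiple choice)
    t3_rating: int              # 0 = missing/inaccurate, 1 = partial, 2 = accurate
    t4_correct: bool            # word-class choice; the motivation is not scored


def score_item(response: ItemResponse) -> int:
    """Return the points for one TC: t1 (0-4) + t2 (0-1) + t3 (0-2) + t4 (0-1)."""
    if not 0 <= response.t1_correct_characters <= MAX_CHARACTERS:
        raise ValueError("t1 must lie between 0 and MAX_CHARACTERS")
    if response.t3_rating not in (0, 1, 2):
        raise ValueError("t3 must be rated 0, 1, or 2")
    return (
        response.t1_correct_characters
        + int(response.t2_correct)
        + response.t3_rating
        + int(response.t4_correct)
    )


def score_test(responses: list[ItemResponse]) -> int:
    """Return the total score of one participant over all items."""
    return sum(score_item(r) for r in responses)


# A fully correct item earns 4 + 1 + 2 + 1 = 8 points.
full_marks = ItemResponse(4, True, 2, True)
print(score_item(full_marks))  # prints 8
```

Under these assumptions, the maximum is 8 points per item and 48 points for the six-item test.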
Two independent raters evaluated the tests. Two-way intra-class correlation coefficient (ICC) was calculated for inter-rater agreement

Statistical Analysis
The descriptive data of the post-test results are reported in Table 3.

Table 3 Descriptive data for post-test results*

Group   Tot (SD)     TC1 (SD)     TC2 (SD)     TC3 (SD)     TC4 (SD)     TC5 (SD)     TC6 (SD)
2-CG    2,50 (2,8)   0,33 (1,1)   0,00 (0,0)   0,00 (0,0)   0,25 (0,9)   1,92 (2,5)   0,00 (0,0)

Overall, the IGs performed best in both years, whereas the CGs obtained the lowest scores in all cases (see also fig. 1). The TC that obtained the highest scores was TC1, closely followed by TC5, whereas the TCs that obtained the lowest scores were TC3 and TC6.
The effect of the instruction on the year-2 groups was measured by means of robust one-way ANOVA.⁵ The omnibus test is statistically significant: F (2; 17) = 166,72; p = 0,00 (< 0,05); ξ̂ = 0,82. This means that, overall, the treatment was effective. The same procedure was followed for year 3. The robust ANOVA test is again significant at p < 0,05: F (2; 17,99) = 237,86; p = 0,00; ξ̂ = 0,83. Robust post-hoc tests confirmed that the difference between the three experimental conditions is significant in both years, with the IGs obtaining the highest scores (Table 4, p crit = 0,017).
The better performance of the IGs over the DGs is also evident if the overall results for each TC type (transparent, semi-transparent, or opaque) and each task are considered. As confirmed by the results of the robust independent-sample t-tests conducted on the data reported in Table 5, the comparisons between the two groups in both years are always statistically significant (p < 0,05), with strong effect sizes except for one medium effect size in year 2, t2 (Table 6).
5 As the data sets of this study rarely meet the assumptions of parametric statistics, and due to the consistent presence of outliers, robust statistics (20% trimmed means) were used to estimate group differences. All statistical tests were performed in R using the WRS2 package. The reported effect size is the explanatory measure of effect size ξ̂ for non-homogeneous variances. Values of ξ̂ = 0,10; 0,30 and 0,50 correspond to small, medium, and large effect sizes, respectively (Mair, Wilcox 2019; Wilcox, Tian 2011).
Robust repeated measures ANOVA was used to compare within-group results for each TC type. The test was not significant for the year-2 and year-3 CGs (p > 0,05), suggesting that in these groups semantic transparency did not play any relevant role in the comprehension of the target items.
Table 7 shows that the difference between transparent and non-transparent TCs is always significant, with transparent TCs obtaining the lowest scores. The difference between semi-transparent and opaque TCs is significant in the 2-DG and the 3-IG, whereas it is not significant in the 2-IG and the 3-DG. In the comparison between semi-transparent and opaque TCs, however, the explanatory measure of effect size is always small except in the 3-IG. This suggests that the magnitude of the difference is meaningful only in this last case, while it is negligible in the others. The explanatory measure of effect size is large in all the other comparisons.
Lastly, robust two-way ANOVA was computed to investigate how competence level (year), type of instruction (direct or indirect), and the interaction between the two factors affected the post-test total scores. The main effect of year (F = 2,26; p = 0,145) and the interaction effect between year and type of instruction (F = 0,30; p = 0,59) on post-test results are not significant (p > 0,05). On the contrary, the main effect of the type of instruction is significant, F = 46,79; p = 0,001 (< 0,05). This indicates that the overall results of the post-test did not depend on the competence level, and that years 2 and 3 were not differently affected by the type of instruction received. The only relevant factor that determined the differences in the total score is the type of instruction, with the IGs obtaining better results regardless of the competence levels of the participants.
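To make the robust statistics mentioned in this section more concrete, the following Python sketch illustrates how a 20% trimmed mean limits the influence of outliers, and how a value of ξ̂ could be labelled using the thresholds 0,10, 0,30 and 0,50. It does not reproduce the WRS2 routines used in the study: the function names and the sample scores are hypothetical, and labelling values below 0,10 as "negligible" is an assumption made here for illustration.

```python
import math


def trimmed_mean(values, proportion=0.2):
    """Mean after dropping floor(proportion * n) values from each tail."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 <= proportion < 0.5:
        raise ValueError("proportion must lie in [0, 0.5)")
    ordered = sorted(values)
    cut = math.floor(proportion * len(ordered))
    kept = ordered[cut:len(ordered) - cut]
    return sum(kept) / len(kept)


def effect_size_label(xi):
    """Label an explanatory effect size using the 0.10 / 0.30 / 0.50 thresholds.

    Assumption: a value is labelled by the highest threshold it reaches,
    and values below 0.10 are called 'negligible'.
    """
    if xi >= 0.50:
        return "large"
    if xi >= 0.30:
        return "medium"
    if xi >= 0.10:
        return "small"
    return "negligible"


# Invented scores with one extreme outlier (40).
scores = [0, 1, 2, 2, 3, 3, 4, 4, 5, 40]
print(sum(scores) / len(scores))  # ordinary mean: prints 6.4
print(trimmed_mean(scores))       # 20% trimmed mean: prints 3.0
print(effect_size_label(0.82))    # prints large
```

With ten scores, two values are removed from each tail, so the single outlier no longer inflates the estimate of the group's central tendency.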

Qualitative Analysis
From a qualitative perspective, the two CGs are characterized by the nearly complete absence of responses. Some literal translations were attempted, but the accuracy rate was low.
The two DGs are also characterized by a large number of missing responses. In the translation task (t1), over-generalization errors in the literal translations were very frequent and mostly concerned characters which were already familiar to the participants: an example is dao 道 'to say' (TC4), which is part of the compound zhidao 知道 'to know' and was erroneously translated as 'sapere' ('to know'). Another relatively frequent type of error consisted in what Richards called "false concepts hypothesized" (Richards 1974, 178), which derive from faulty comprehension of distinctions in the target language. For instance, liang 两 'two' (TC5) was translated as 'quattro' ('four') in six cases (three in year 2 and three in year 3), due to its graphic resemblance to si 四 'four'.
In t2, there was a clear tendency to indicate the TCs as non-transparent. As a result, the explanations provided in t3 to the idiomatic meanings of the transparent TCs are often metaphorical or 'proverb-like'. In TC3, for instance, partially accurate explanations like 'vedere tanto insegna tanto' ('seeing much teaches much') and 'chi vede molto conosce più cose' ('who sees much knows more things') clearly resemble the form of Italian proverbs.
Another tendency observed in TC5-t3 consisted in only providing the vehicle of the metaphor which motivates the expression ('the coffin') instead of explaining its metaphorical meaning ('unexpected misfortune or death').⁶ As a last remark, when explaining the metaphorical meaning of TC1, the participants often resorted to equivalent Italian expressions such as 'avere le mani bucate' (lit. 'to have holed hands') or 'essere uno spendaccione' (lit. 'to be a money-waster'), suggesting a strong effect of L1 transfer.
The responses to t4 were very scarce and their accuracy low, especially in the 2-DG.
In the IGs, the number of responses was larger than in the other groups, and the degree of accuracy was higher. In addition, the occurrence of over-generalization errors was considerably reduced: in year 2, for instance, the critical character dao 道 was translated as 'sapere' ('to know') in only 5 cases out of 14.
Like the DGs, the IGs also tended to consider the TCs as non-transparent (t2), even though the degree of accuracy of the explanations provided in t3 was generally higher. Nevertheless, some cases of proverb-like explanations of the transparent TCs are also attested.
6 According to the free online encyclopedia Baidu baike 百度百科 (https://baike.baidu.com/item/三长两短, accessed 2019-10-15), the literal meaning of san chang liang duan 三长两短 (Table 2) refers to the wooden boards composing a coffin, excluding the top lid. Though not attested in the other consulted sources, this explanation seemed suitable to activate the mental imagery required by the etymological elaboration.

Sergio Conti
Another difference worth noting is that in TC5 only one participant limited the explanation to the vehicle of the metaphor. Most of the remaining responses are either accurate or partially accurate. Some examples are 'essere in fin di vita' ('to be on your deathbed') and 'stare con un piede nella fossa' (lit. 'to have a foot in the grave'). Like the DGs, the participants in the IGs resorted to Italian idioms to explain the meaning of the TC, confirming the general reliance on L1 transfer whenever feasible.
Lastly, the responses to t4 were more numerous than in the other groups, and the motivations more accurate. Concepts like 'head of the phrase' and 'syntactic function' were frequently mentioned, suggesting that the benefits of the instruction are also reflected in the participants' metalinguistic knowledge.

Effect of Instruction
Both the quantitative and qualitative analysis of the collected data suggest that the performance of the experimental groups was significantly better than that of the control groups. This leads to the conclusion that explicit instruction, whether direct or indirect, did have an effect on the comprehension and retention of the TCs. The high percentage of missing responses in the CG tests confirmed that, as assumed, the TCs were not previously known by the participants, thus validating the results of the test. On the other hand, all the constituent characters were already familiar to the participants: it was thus expected that the CGs would be able to attempt some answers, at least for the literal translations. Presumably, prior knowledge about chengyu and their difficulty might have inhibited the participants from attempting any translation of the meanings of the TCs, even though they were already familiar with the single characters. Without the literal meaning, the participants in the CGs did not have any basis for activating the comprehension strategies that characterize the heuristic approach to L2 idiom interpretation described by Cooper (1999), nor were they able to infer their meaning or word class.
On the contrary, the participants in the experimental conditions seem to have benefited from the didactic interventions despite the short duration of the teaching sessions and the relatively long interval between the treatment and the post-test. The treatment was also beneficial for the memorization of the critical characters, and this confirmed the effectiveness of the character-centred approach. Character-by-character analysis not only mitigated the influence of the critical characters over the comprehension of the literal meaning of the TCs, but also helped the participants adjust their previous knowledge accordingly. These results are consistent with those obtained in previous studies on idiom and collocation learning (e.g. Hsu 2010; Kasahara 2011). As demonstrated by Zyzik's (2011) findings, learning idioms with unknown lexical items is not necessarily more difficult, as the extra cognitive step required for learning idioms with unknown constituent parts is minimal and easily overcome during the learning process. In the present study, the required cognitive effort was reduced even further, as the constituent characters were not completely unknown, and the participants only had to readjust their prior knowledge.

Effect of the Type of Instruction
Although both the DGs and the IGs outperformed the CGs, the IGs obtained better results from both a quantitative and a qualitative perspective. It can thus be concluded that indirect instruction was more effective than direct instruction.
The inductive approach was not only effective for the retention of the meaning of the target structures, but also beneficial for metalinguistic awareness, allowing the participants to notice salient structures and to express their metalinguistic knowledge more precisely.
The factor that best explains the performance of the IGs is the higher cognitive effort required in discovery learning. According to Boers, Demecheleer and Eyckmans (2004, 72), it is possible that a problem-solving task requiring students to try and infer the meaning of an idiom via its etymology and then verify (or falsify) their interpretation involves deeper processing than rote learning, and this may be beneficial to retention. These activities allow learners to focus their attention on the salient elements of the input and improve their analytic abilities for the comprehension of the target forms, while stimulating their autonomy and favouring their engagement in the heuristic approach to idiom comprehension (Liu 2008).
The participants in the IGs also benefited from collaborative learning. According to constructivism (Jonassen 1994; Vygotskij 1980), knowledge is the result of active social collaboration and negotiation. In other words, interaction with others plays a primary role in the co-construction of meaning from experience. In Vygotskij's terminology, acquisition occurs when learners are placed in their zone of proximal development, defined as the region between what they are capable of doing independently and what they have the potential to do under the guidance of or in collaboration with peers. In vocabulary learning, negotiation, which involves working out the meaning of a word through discussion, provides all the conditions needed for effective learning, namely, interest, understanding, repetition, deliberate attention, and generative use (Nation 2005, 585). It can thus be hypothesised that group work, between-group discussion, and the instructor's feedback provided the IGs with precisely these beneficial conditions.

Effect of the Degree of Transparency
The answer to RQ3 is less straightforward. The results of the RM ANOVA and the post-hoc tests seem to suggest slightly different scenarios in the two years. In brief, the transparent TCs proved the most difficult to retrieve in both years and conditions, whereas semi-transparent and opaque TCs did not show any regular pattern in terms of ease of memorization and retrieval. These results contrast with previous assumptions about how easy idioms and chengyu are to learn, as the cognitive advantage of non-metaphorical expressions alone did not seem to correspond to a reduced difficulty. A possible explanation is provided by the Levels of Processing and the Dual Coding theories, on which etymological elaboration is based (Boers, Demecheleer, Eyckmans 2004; Boers, Eyckmans, Stengers 2007). Compared to transparent chengyu, non-transparent chengyu require more engagement with the linguistic data, as the identification of the metaphoric themes behind them involves a higher degree of cognitive effort (Boers 2001). This satisfies the assumptions of the Levels of Processing theory, according to which the greater the cognitive involvement load, the greater the chances of retaining linguistic information (Cermak, Craik 1979). At the same time, according to the Dual Coding theory, verbal information associated with mental imagery leaves more durable traces in long-term memory (Paivio 1986): as figurative chengyu are likely to call up a mental picture, it can be assumed that they have a higher mnemonic potential.
As for semi-transparent TCs, the mixed results might depend on the fact that the imagery they evoke is perhaps less vivid than that evoked by opaque TCs and thus less memorable. Similar conclusions have been drawn in several studies on idiom learning, particularly those investigating the effectiveness of the cognitive semantic approach to teaching phrasal and prepositional verbs based on orientational metaphors (Kövecses, Szabó 1996).
To sum up, in instructed conditions, a difficulty scale based on chengyu transparency can be hypothesized. At one end of the scale are transparent chengyu, which require less cognitive effort and do not call up any mental imagery, thus leaving feeble mnemonic traces. At the other end are opaque chengyu, which evoke vivid images and require deeper processing, resulting in more durable mnemonic traces. Between the two are semi-transparent chengyu, which can still benefit from etymological elaboration, albeit with mixed results due to their weaker imageability. Lastly, it cannot be excluded that L1 transfer played an important role, especially for the TC da shou da jiao 大手大脚, lit. 'big hand big foot' (Table 2). In fact, the TC and the Italian idiom 'avere le mani bucate' (lit. 'to have holed hands') are based on similar metaphors and share the same communicative function (describing someone who is exceedingly wasteful). This might have constituted an additional aid to the participants, as suggested by the fact that the Italian idiom occurred very often in the responses to TC1-t3. To date, however, the benefits of L1 transfer in idiom comprehension and learning are yet to be demonstrated, and several studies have reported divergent conclusions (Abdullah, Jackson 1998; Bulut, Çelik-Yazici 2004; Irujo 1986b; Taki, Soghady 2013).

Effect of Competence Level
The last RQ asked whether the effects of etymological elaboration in chengyu teaching vary according to the participants' competence level. Although the test scores obtained by year 3 were generally higher than those obtained by year 2, the results of the robust two-way ANOVA test clearly suggest that the main effect of year and the combined effect of year and type of instruction are not significant. Etymological elaboration combined with the character-centred approach proved effective for upper-elementary and intermediate learners alike, provided that the selected chengyu and the context in which they are presented are appropriate for lower-level learners, from both the lexical and the grammatical standpoint. The only factor that distinguishes the participants in the two years is the type of instruction received, whether direct or indirect.

6 Pedagogical Implications, Limitations, and Conclusions
As observed by Guo:
given the importance of Chengyu and the fact that learners at all proficiency levels will almost inevitably be faced with Chengyu in any encounter with native Chinese speakers […] it is beneficial to lay the foundations for Chengyu acquisition as early […] as possible. (Guo 2017, 101)
The present study demonstrated not only that chengyu teaching is feasible at earlier stages, but also that, under certain conditions such as adequate lexical and grammatical complexity, lower-level learners can obtain the same benefits as higher-level learners.
The results of this study also contradict previous assumptions concerning the difficulty of chengyu by showing that transparent chengyu are not necessarily easier than non-transparent ones. If associated with indirect instruction and cognitive-mnemonic techniques such as etymological elaboration, semantic opacity can be an aid to memorization and retrieval and can thus facilitate acquisition.
This study has several limitations, which correspond to a number of interesting issues for future research. First, whether imagery processing is beneficial for the retention of the form of the chengyu as well as their meaning was not demonstrated. In fact, this is still an area of debate, and the results obtained in different studies on idiom learning are conflicting (Boers et al. 2009; Szczepaniak, Lew 2011). Second, how best to teach transparent chengyu also needs further investigation. Teaching techniques that proved effective for the acquisition of non-idiomatic and non-transparent formulae might result in better outcomes (Boers, Lindstromberg 2012). One possibility might be to focus on phonetic regularities such as rhyme, alliteration, or tonal alternation (An 2016). Third, the influence of L1 transfer and chengyu morphosyntactic structure surely deserves deeper inquiry. These aspects were only indirectly touched on in this study, but there are good reasons to hypothesize that they might play a relevant role in chengyu learning. In particular, some preliminary evidence seems to suggest that L1 transfer may have a negative effect on the interpretation of unknown chengyu (Conti 2017). Fourth, the longitudinal effects of etymological elaboration were not tested. Guo (2008) found that this technique proved effective for the long-term retention of English idioms; however, there is no evidence that this is also the case for chengyu learning.
Lastly, according to Liu (2008), a solid grasp of idioms involves a command of all three key aspects of idiom knowledge, that is, form, meaning, and use. This study only focused on the recall of the meaning of the TCs. Other aspects of chengyu knowledge which surely need to be addressed in future studies include production and depth of chengyu knowledge, especially as regards register, connotation, and use. However, the comprehension of idiom meaning plays a crucial role in idiom acquisition because a learner needs to understand an idiom before acquiring it. Thus, helping students understand idioms should be the main focus of idiom instruction. (Liu 2008, 126)
Despite its many limitations, this study was still able to demonstrate that etymological elaboration and discovery learning did help learn-