Using data-driven learning activities to improve lexical awareness in intermediate EFL learners

Abstract
A large number of empirical studies have demonstrated the impact of lexical chunks on foreign language learning. However, reliable and efficient use of the lexical approach remains a problem. In this paper, a revised chunk teaching approach, based on Data-Driven Learning (DDL) activities and language awareness theory, is proposed and tested. Taking an action research approach, the effects of DDL activities on increasing the lexical awareness of intermediate EFL learners were investigated. A 16-week course, using the teaching pattern of EIODI (Expose-Identify-Observe-Discover-Internalize) along with recognition and production tasks, was used to teach 158 intermediate EFL first-year university students. Quantitative questionnaire data from an in-class survey and a university course assessment survey revealed that the learners increased their awareness of lexical chunks and developed positive attitudes toward this approach. In addition, they reported a positive impact on their English proficiency; these claims were supported by a statistical comparison of Finals for College English Course test scores for two successive academic years. This research highlights the importance of increasing learners’ language awareness and using effective DDL activities.

ABOUT THE AUTHOR
Dr. Xue Liya works as a language lecturer at the School of International Studies of Zhejiang University in China and as a researcher in the Institute of Linguistics and Applied Linguistics in Foreign Languages of Zhejiang University. Her research interests are second language acquisition and speech prosody. She previously conducted research on prosodic disambiguation in a second language at the University of Pittsburgh in the USA as a visiting scholar. In recent years, she has been working on a teaching reform project on the promotion of language awareness in the EFL classroom. This paper reports one finding of that project: lexical awareness can be effectively raised through Data-Driven Learning (DDL) activities. Future work will focus more on prosodic awareness in language education.

PUBLIC INTEREST STATEMENT
In foreign language teaching, intermediate students pose a big challenge to language teachers, partly because the students continue to have gaps in their language knowledge that need to be filled, and partly because their progress is less noticeable than that of beginners; thus, their motivation to learn tends to drop. In this paper, a revised chunk teaching approach, based on Data-Driven Learning (DDL) activities and language awareness theory, is proposed and tested. The results revealed that the learners increased their awareness of lexical chunks and developed positive attitudes toward this approach. This study highlights the importance of increasing learners' language awareness and using effective DDL activities.

Introduction
According to CEFR (Common European Framework of Reference for Languages) language proficiency level descriptors (as cited in Figueras, 2012), the majority of Chinese first-year university students are B1 (intermediate) or B2 (upper-intermediate) EFL learners. With ten or so years of English education at school, these students are familiar with the basic grammatical structures of English and with moderate English vocabulary and can usually "produce simple connected text on topics which are familiar or of personal interest" (B1-Intermediate), but have difficulty in "using language flexibly and effectively for social, academic and professional purposes" (C1-Advanced). To be specific, Chinese first-year university students often find themselves plateaued at a certain level of English competence, producing simple speech and using Chinglish expressions when writing. Many learners have difficulty breaking through this "plateau stage" (Chen & Cheng, 2018).
Intermediate students pose a big challenge to language teachers, partly because the students continue to have gaps in their language knowledge that need to be filled, and partly because their progress is less noticeable than that of beginners; thus, their motivation to learn tends to drop. Lewis (2000, p. 14) argues, "The reason so many students are not making any perceived progress is simply because they have not been trained to notice which words go with which. They may know quite a lot of individual words which they struggle to use, along with their grammatical knowledge, but they lack the ability to use those words in a range of collocations which pack more meaning into what they say or write". Here "collocations" also refers to other terms, such as lexical phrases, formulaic sequences, and lexical chunks. Lewis's statement explicitly explains why intermediate EFL learners fail to flexibly and idiomatically use the English language. Although in recent years, collocation learning has been the focus of research on vocabulary learning and teaching for second language learners (Wang & Yang, 2020), studies have repeatedly demonstrated the difficulties encountered by learners in collocation learning and use (Hsu & Chiu, 2008).
Traditional English language teaching in China approaches vocabulary and grammar separately and pays little attention to lexical items, which leads to the fact that "new word acquisition is not based on lexical grammar, so meaning is separated from colligation and collocation" (Li, 2018b, p. 498). As a result, learners' active vocabulary is often composed of a limited number of lexical chunks, as learners have not even been "trained to notice" these lexico-grammatical units. In other words, they have not developed an authentic awareness of lexical chunks. Therefore, despite years of hard work in English classes, their lack of vocabulary knowledge is often cited as one of the obstacles they struggle to overcome (Zheng, 2009).
To break the plateau of English learning of intermediate EFL learners caused by the lack of lexical chunks, this study experiments on a teaching approach based on data-driven learning activities and attempts to raise learners' lexical awareness. Wray (2002, p. 9) defines lexical chunks as "a sequence, continuous or discontinuous, of words or other elements, which is, or appears to be, prefabricated; that is, stored and retrieved whole from memory at the time of use, rather than being subject to generation or analysis by the language grammar". Originating in the field of psychology, chunking is considered the all-important principle of human cognition, implying "the ability to build up structures recursively; thus leading to a hierarchical organization of memory" (Newell, 1990, p. 7). Some applied linguists argue that this concept can be applied to language acquisition and that, contrary to the traditional view that language is composed of grammar and vocabulary, language consists of lexical chunks, which are the essential mediator between conceptualization and grammatical and phonological encoding (Lewis, 1993, 1997; Nattinger & DeCarrico, 1992; Leech, 1983; Levelt, 1989). Thus, the lexical approach proposed by Lewis (1993) focuses on teaching lexicogrammatical units and encouraging learners to perceive these units as a whole and store them in the mind as such. These units fall into four types: 1) polywords, such as by the way, in a nutshell, to sum up; 2) collocations, such as meet one's needs, striking increase; 3) institutional expressions, such as how do you do, nice meeting you; and 4) sentence frames, such as A new study suggests that . . . (Lewis, 1997). Since these chunks serve as the basic units of human memory, they play an essential role in the learning process. An average native speaker knows hundreds of thousands of chunks, thus achieving a taken-for-granted degree of real-time fluency (Pawley & Syder, 1983).

Lexical chunks
In recent decades, there has been a growing recognition of the importance of lexical chunks in foreign language acquisition. Research has found a strong relationship between lexical chunks and second language skills. Ding and Qi (2005), for example, have found that the ability of second language learners to use formulaic language is a better predictor of oral and written English performance than grammatical accuracy. In other words, good language learners are good users of formulaic sequences. Boers (2006) has suggested that it is helpful for language learners to establish a set of formulaic sequences to improve their oral proficiency, while Mohammadi and Enayati (2018) have found that the input of lexical chunks helps to avoid errors caused by vocabulary selection and cultural differences, thus improving the accuracy of spoken English. Kazemi et al. (2014) have demonstrated that lexical chunks help students to improve their writing ability and argue that all writing courses should pay attention to the teaching of lexical chunks. Ilaria (2015) argues that "The most talented reader is the one who is able to identify the set of patterns which the words belong to and continually guesses the following item, in a steady dialogue between the so-called 'Encyclopedia', and the new information." Research by both Tang (2013) and Li (2018a) has revealed that the number of lexical chunks that a learner possesses is closely related to listening performance, suggesting that lexical chunks can improve the efficiency with which a learner processes language information. Although a number of empirical studies demonstrate the impact of chunk input on EFL learning, there has been limited research into how to reliably and effectively teach lexical chunks.

Language awareness
Language awareness refers to "knowledge about language", which encourages "the development in learners of an enhanced consciousness of and sensitivity to the forms and functions of language" (Carter, 2003, p. 64). A language awareness approach has been developed in contexts of both mother-tongue language education and second language acquisition. A key element of this approach is that learners "discover language for themselves" (Bolitho et al., 2003, p. 251). As a concept that has long played an important role in language education, language awareness is more often associated with more prescriptive approaches, characterized by, for example, the 1980s method of analyzing language form. It has since evolved beyond "knowledge about language", or a focus on language itself, to emphasizing the cognitive advantages of reflecting on language, and to the belief that learners' attitudes towards language and language learning can be changed by methods which highlight specific language features through learners' active involvement (Bolitho & Tomlinson, 1995). Therefore, language awareness emphasizes both language structure and language use, and considers both formalistic and communicative language teaching methodologies. More specifically, language awareness involves awareness of both comprehension and use of grammar and awareness of the use of language in practical communication (Yu, 2016). Traditionally, language awareness research has been confined to the study of grammar (Ellis, 1998; Valeo, 2013). This has also traditionally been the case in China (Pang, 1996; Peng, 1999). Meanwhile, awareness of the use of language in practical communication has been neither extensively researched nor developed. As "fixed or semi-fixed frequently-used syntactical structures with discourse functions" (He, 2016, p. 143), lexical chunks are units that can help learners connect language structure with language use.
As Lewis (1993, pp vi-vii) notes, "A central element of language teaching is raising students' awareness of, and developing their ability to, 'chunk' language successfully." Therefore, the value of enhanced "noticing" and of "consciousness raising" of lexical chunks in second language learners should be emphasized.
However, although the traditional present-practice-produce (3P) mode of teaching may work for grammar, it is less successful in raising learners' awareness of the pragmatic function of lexical chunks in authentic contexts, leaving learners "void of sensitivity of those chunks which, though grammatically correct, do not conform to the idiomaticity of native use" (Yu, 2016, p. 14). Based on a corpus study (Wei, 2007), lack of pragmatic quality, including interaction, cooperation, courtesy, and appropriateness of discourse, is a major problem in the spoken English of Chinese university students. Pragmatic failure is an important cause of cross-cultural communication breakdown. Misshapen, hybrid language that might be described as "English with Chinese characteristics" is commonly found in EFL writing and translation, even among some who are highly trained and experienced (Pinkham, 2000). The reason for this is an inadequate store of pragmatic lexical chunks. Therefore, a revised approach to chunk teaching based on Data-Driven Learning (DDL) activities is proposed.

Data-driven learning
Data-Driven Learning is a term coined by Tim Johns, who stated that DDL is "the attempt to cut out the middleman as far as possible and give the learner direct access to the data" (Johns, 1991, p. 30). Through direct exposure to real-world contexts in the form of data, learners "gradually get aware [sic] of observing linguistic phenomena, analyzing and summarizing language features, and in this way, strengthen their language abilities" (Zhang, 2018, p. 107). In this paper, "data" include both attested uses of language recorded in real communicative contexts and authentic uses of language stored in language corpora and accessed predominantly through concordancers. Hassan (2018, p. 24) describes a corpus as a "collection of naturally occurring language stored on a computer and used to tell how language is used". Compared to the 3P mode of teaching, DDL focuses on authentic language exposure and student-centered exploratory learning.
Over the past three decades, a growing body of empirical DDL research has demonstrated its positive effects on promoting learner autonomy, increasing language awareness, enhancing noticing skills, and extending learners' cognitive abilities (Boulton, 2009, 2010; Lin & Lee, 2015; Luo, 2016). In recent years, more studies have been conducted on the role of DDL in the EFL classroom. For example, DDL activities have proved effective in English preposition learning and increasing learners' grammatical knowledge (Boontam & Phoocharoensil, 2018), in the acquisition of lexicogrammatical patterns in EFL writing (Yılmaz, 2017), and in improving collocational competence (Hua & Azmi, 2021). However, to date corpus tools still have not "fully 'arrived' on the pedagogical landscape" and the practice of foreign language teaching seems to have been largely unaffected by advances in corpus research, with relatively few teachers and learners aware of the availability of useful resources and having personal access to corpora (Römer, 2010, p. 18). Besides low awareness of and limited access to corpora, the reasons may also lie in teachers' lack of detailed practical guidance on DDL activities, the difficulty of integrating DDL into the training of a particular language skill, such as lexical ability, and uncertainty about the effectiveness of DDL and about students' attitudes towards the DDL approach. Therefore, in this study a teaching pattern, EIODI (Expose-Identify-Observe-Discover-Internalize), with DDL activities is designed and practised to answer the following three questions:
(1) Can the EIODI teaching pattern with DDL activities increase the learners' awareness of lexical chunks?
(2) Does the improvement of lexical awareness influence the learners' English proficiency?
(3) What are the learners' attitudes toward this DDL approach?

Context
Based

Population
Four parallel classes consisted of a total of 158 non-English majors, ranging in age from 18 to 20 years, of whom 58 were female (36.7%) and 100 were male (63.3%). The vast majority of students had studied English at school for ten years, while a small minority had only six years of school-level English. All students were Chinese, and none had reported learning disabilities. All students agreed to participate in this study.

Data used in DDL
Two types of data were selected for DDL in this study. The first type of data was the iWeb corpus (https://corpus.byu.edu/iweb), released in May 2018. A number of features made this an ideal choice. This corpus is approximately 14 billion words in size, making it 25 times as big as COCA (560 million words). It allows the user to create a "virtual corpus" for any topic, and to access a wide range of searches, including words, phrases, substrings, lemmas, part of speech, synonyms, and customized wordlists. A unique feature of iWeb, which makes it very useful for language learners and teachers, is the ability to browse through a list of the top 60,000 words (lemmas) in the corpus, together with a wide range of information on each of those words. Appendix 1 contains an example of a word home page. Zhejiang University has officially purchased the iWeb corpus, so that students have free access. This affordance enables students to have direct interaction with the real uses of words, collocations, and clusters. The second type of data was the authentic materials in real communicative contexts, including TED talks (www.ted.com), news articles, The World of English magazine, original English language books, and New College English textbooks. Of these, TED talks, BBC news, and New College English VLS (Viewing, Listening and Speaking) were used for listening input, and The World of English, original English language books, and the New College English textbook were used for reading input. Given that interest may increase motivation to learn, students were given the freedom to choose some materials, while some were chosen by the teacher and related to textbook topics so that the learning process could be easily monitored. The students had easy access to all the above resources via the internet and the university library.

Procedure
During the first two weeks, the course aims were explained and key concepts, such as lexical chunks, language awareness, and DDL, were introduced. The students were then provided with basic information about corpus-based learning and were guided in the use of iWeb corpus.
For the remaining 14 weeks, an EIODI (Expose-Identify-Observe-Discover-Internalize) pattern of teaching was adopted. First, students were exposed to real language data. In addition to the use of textbooks, each week the students were required to listen to one TED talk chosen by the teacher relating to the textbook topic and one news report chosen by the students relating to their own interests. In addition, they were required to read one magazine article chosen by the teacher relating to the textbook topic and one chapter of an original book chosen by the students and relating to their own interests. Next, in the "identify" stage of learning, students were required to identify where communication broke down in both the reading and listening tasks and make a list of the words they were most interested in learning. This process enabled students to develop sensitivity to the use of words in different language styles and in diverse contexts. In the "observe" stage, students were required to use the iWeb corpus to investigate these words by examining high-frequency collocates, clusters, and topics, so that when these words were clearly presented in real world contexts, the students could better perceive and comprehend the language habits of native speakers (see Appendix 2). In the "discover" stage, students were required to explore the rules and patterns of language use by reading the iWeb concordance lines for these words and phrases through both direct corpus search and indirect corpus search, the latter using concordance lines extracted by the teacher. Students were required to take note of patterned language (lexical chunks). In the "internalize" stage, students were required to participate in both recognition and production activities. Recognition activities, including cloze tests (see Appendix 3), paraphrasing, dictation, correcting and collocation matching, were used to test how well the students recognized the lexical chunks.
Production activities involved summary writing (see Appendix 4) and public speaking. Students were required to write summaries of news stories and magazine articles and were assessed on their use of lexical chunks. Well-written summaries were shared on the QQ social media group of which all participants were members. Public speaking took the form of a group discussion in which students were required to use newly learned chunks from the TED talk to express their understanding and opinions of the talk.
The EIODI teaching pattern was used both in and outside of class. Textbook articles and textbook VLS materials were predominantly used in class. The teacher helped students to identify chunks and to develop sensitivity to the functions of chunks in diverse contexts and in various language styles (see Appendix 5). Concordance lines taken directly from the iWeb corpus were used as real-world examples and the students observed language patterns in order to improve their ability to perceive native English language use. Meanwhile, students were required to follow the EIODI pattern to learn from other authentic materials outside of class and to accomplish tasks, such as paraphrasing, correcting, collocation testing, public speaking, summary writing, and group sharing.
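To make the "observe" and "discover" stages concrete, the kind of collocate counting that a concordancer performs can be sketched in a few lines of Python. This is only an illustrative sketch and not the iWeb interface or the procedure used in the course: the function name collocates, the two-word window, and the three example sentences are all hypothetical.

```python
from collections import Counter
import re


def collocates(lines, node, window=2):
    """Count words occurring within `window` tokens of `node`.

    A deliberately simple, assumed approach: lowercase the text,
    split it into alphabetic tokens, and count the neighbours of
    every occurrence of the node word.
    """
    counts = Counter()
    for line in lines:
        tokens = re.findall(r"[a-z']+", line.lower())
        for i, tok in enumerate(tokens):
            if tok == node:
                start = max(0, i - window)
                neighbours = tokens[start:i] + tokens[i + 1:i + 1 + window]
                counts.update(neighbours)
    return counts


# Hypothetical concordance lines, not taken from iWeb
lines = [
    "The new service will meet the needs of local families.",
    "Schools must meet the needs of every learner.",
    "We adapted the course to meet the needs of our students.",
]

result = collocates(lines, "needs")
print(result.most_common(5))
```

In these invented lines the words meet, the, and of recur around needs, which is the sort of pattern (meet the needs of) that learners were asked to note as a lexical chunk.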

Data collection
Student feedback on course content, teaching quality, and overall learning experience has been found to be an important guide for course design and development, and plays a significant role in the delivery of high-quality, student-centered education (Leckey & Neill, 2001;Steyn et al., 2018). In order to acquire a clearer understanding of the role of DDL activities in increasing learners' awareness of lexical chunks, how language awareness influences English proficiency, and learners' attitudes toward this DDL approach, two anonymous questionnaires were used to collect data. The first was a feedback questionnaire on the teaching approach used (see Appendix 6), distributed by the teacher on the last day of class. The second was a course evaluation questionnaire conducted by the university administration center to examine student progress throughout the course.
The teaching approach feedback questionnaire consisted of 30 statements in four categories: awareness of lexical chunks, enhancement of language proficiency, difficulty using iWeb corpus, and attitude toward the pedagogical approach. Some items were adapted from existing research (Asuman et al., 2016; Geluso & Yamaguchi, 2014). This was a hard copy paper questionnaire and all statements were presented in English. The students were required to indicate their degree of agreement with each statement on a 5-point Likert scale. All questionnaires were completed and 158 valid sets of data were collected. Data analysis was conducted using SPSS 16.0. Tables 1-4 present results for number of participants, percentage of agreement based on the population choosing 4 and 5 (agree and strongly agree), agreement range (scale minimum and maximum), means, and standard deviations for each category. The university administration center's course evaluation questionnaire was conducted at the end of the semester. Table 5 reports the results obtained from Zhejiang University's educational administration system (http://jwbinfosys.zju.edu.cn).
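The per-item statistics described above (number of participants, percentage choosing 4 or 5, range, mean, and standard deviation) can be illustrated with a minimal Python sketch. The study itself used SPSS 16.0; the function summarize_item and the ten responses below are hypothetical and serve only to show how each figure is derived. The sample standard deviation is assumed here.

```python
from statistics import mean, stdev


def summarize_item(responses):
    """Summarize one questionnaire item rated on a 1-5 Likert scale."""
    n = len(responses)
    # Agreement is the share of respondents choosing 4 (agree) or 5 (strongly agree)
    agree = sum(1 for r in responses if r >= 4)
    return {
        "n": n,
        "pct_agree": 100 * agree / n,
        "min": min(responses),
        "max": max(responses),
        "mean": mean(responses),
        "sd": stdev(responses),  # sample standard deviation (an assumption)
    }


# Hypothetical responses, not data from the study
sample = [5, 4, 4, 3, 5, 2, 4, 5, 4, 4]
summary = summarize_item(sample)
print(summary)
```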

Data analysis
As Table 1 demonstrates, students felt strongly about the general increase in their lexical awareness, as shown in the response to item one: My awareness of lexical chunks has developed (m: 4.54). Following intensive exposure to lexical chunks, including the introduction of basic information about the lexical approach, exploratory learning of lexical chunks through the iWeb corpus, and task-based activities, increased sensitivity to lexical chunks was reported. The statement with the second highest levels of agreement was My awareness of idiomatic collocations has developed (m: 4.42), matching item two in Table 2: Useful for finding idiomatic collocations (m: 4.54) and item one in Table 3: I trust the collocations I find in the corpus (m: 4.30). Item 2, My awareness of the importance of word frequency has developed (m: 4.37), also scored highly, as did item 6, My awareness of active participation in the learning process has developed (m: 4.22). The development of word frequency awareness may be closely related to exposure to the iWeb corpus, which highlights word frequency. Item 6 matched item 3 in Table 3: I enjoy being able to direct my own learning (m: 4.03). There was less agreement on statements such as those concerning word use in different registers (item 3, m: 3.8) and the difference between near synonymous phrases (item 5, m: 3.95). For many EFL learners, word use precision is a problem. DDL activities and the lexical approach offer new perspectives on learning vocabulary and more emphasis should be given to the in-depth study of words.  : 4.21), and using more idiomatic expressions in speaking (m: 4.04). Items 1-6 concerned vocabulary learning, and all achieved mean agreement greater than 4, with the exception of item 5, Helpful for understanding the difference between near synonymous phrases (m: 3.99). 
Items 7-10 concerned improvements to basic language skills, and all achieved mean agreement greater than 4, with the exception of item 10, Helpful for improving listening (m: 3.48).
With regard to the iWeb corpus, Table 3 indicates three noteworthy problems with its use: some learners felt that there were too many sentences (m: 3.46); some agreed with the statement that it  (<1), demonstrating a range of abilities in using the corpus, which implied that students with low self-motivation and greater dependence on guidance from the teacher might require more focused instruction on corpus use. Table 4 shows that all items scored a mean agreement of above 4, with the exception of item 2, Using the corpus has motivated me to learn words (m: 3.69). This indicates that most students had  (m: 4.16). Here, item 8 is a reminder that English proficiency levels among students may be an important factor contributing to the effectiveness of this approach. Item 7 shows that the lexical approach can help learners to understand that grammar and vocabulary are not necessarily acquired separately. Surprisingly, item 2, Using the corpus has motivated me to learn words, was the statement upon which respondents showed least agreement. Table 1 and Table 2 demonstrate that DDL helps students to increase their awareness of lexical chunks and improve their understanding of vocabulary. However, a question remains as to why the students did not feel strongly motivated to learn words. The reason for this may be attributed to the difficulties confronted in using the corpus, such as the students' belief that it was time-consuming and contained too many sentences. Table 5 provides further evidence of approval for the course design based on the course evaluation conducted by the university educational administration system. This evaluation removed both the top and bottom 5% of the population, resulting in an average satisfaction score for the course of 4.88. This reveals that, in general, students were satisfied with the course design.
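The course evaluation score is described as an average taken after the top and bottom 5% of responses are removed, that is, a trimmed mean. The following Python sketch shows one way such a score could be computed. The exact rule applied by the university system (for example, how fractional counts are rounded) is not stated, so the rounding down used here, the function name trimmed_mean, and the ratings are all assumptions.

```python
def trimmed_mean(scores, percent=5):
    """Mean after dropping the lowest and highest `percent` of scores.

    The number dropped from each end is rounded down, which is an
    assumed convention rather than the university's documented rule.
    """
    ordered = sorted(scores)
    k = len(ordered) * percent // 100  # scores dropped from each end
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


# Hypothetical ratings: 20 scores, so one is dropped from each end
ratings = [1.0] + [4.0] * 9 + [5.0] * 10

print(sum(ratings) / len(ratings))  # plain mean, pulled down by the outlier
print(trimmed_mean(ratings))        # trimmed mean
```

Trimming limits the influence of a few extreme ratings, which is presumably why the evaluation system applies it.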
In order to compare the students' perceptions of their improvements in English proficiency with their actual improvements, Table 6 compares Finals for College English Course test scores for two successive academic years, 2018-2019 (18-19) and 2019-2020 (19-20). Both groups consisted of first-year non-English majors at Zhejiang University, classified into Band 3 based on their College Entrance Examination English test score. Both groups took the same course taught by the same teacher, and both groups took final tests of the same difficulty level. The two groups differed only with regard to teaching method. A traditional 3P teaching mode was adopted in the 18-19 course, while the new Lexical Approach with DDL activities was trialed in the 19-20 course. As Table 6 demonstrates, the 19-20 students, who were taught with the new method, outperformed the 18-19 students, who were taught with the traditional method. There was an almost 3% increase in average scores and, of the tested language skills, the biggest improvements were in vocabulary and listening, while reading and writing showed no noticeable improvement. These results support the students' perceptions of overall improvement in their English proficiency, especially with regard to vocabulary learning (Table 2).

The importance of being "trained to notice"
The common language units traditionally taught to students are words, phrases, and sentences. For the teaching of grammatical structure, these units are useful because words and phrases are used to build sentences. However, these units may not be the best choice for practical language use, because EFL learners usually have little understanding of how words are put together in real use, even though they can construct grammatically correct sentences with a large vocabulary and according to grammatical rules. In this study, lexical chunks were introduced as salient units for language use in the belief that language is made up of chunks of meaning. When these chunks are put together, they produce coherent meaning, and only a small proportion of spoken sentences are entirely novel creations (Moudraia, 2001). Therefore, this lexical approach pays special attention to lexical chunks, including polywords, collocations, institutionalized utterances, and sentence frames and heads. "Instead of words, we consciously try to think of collocations and to present these in expressions. Rather than trying to break things into ever smaller pieces, there is a conscious effort to see things in larger, more holistic ways" (Lewis, 1997a, p. 204). As the words that co-occur in natural text with greater than random frequency, lexical chunks are not determined by logic. Instead, arbitrary clusters become the prefabricated units of language use. Nattinger (1980) suggests that language teaching should be based on the idea that language production is the piecing together of readymade units appropriate for particular situations.
However, English language education has, for a long time, failed to properly emphasize the important role of these ready-made units in language teaching and learning. The EIODI teaching pattern with effective DDL activities used in this study has raised EFL learners' lexical awareness through the identification of lexical chunks in reading and listening, the development of sensitivity to English collocation through adequate exposure to authentic language data in the iWeb corpus as well as other tested materials, and improved confidence in language use through recognition and production activities. Upon completion of the 16-week course, learners reported increased awareness of lexical chunks and a positive effect on their vocabulary learning, reading, writing, and speaking. In addition, students expressed a positive attitude towards self-directed chunk learning and a deepened understanding of grammar and vocabulary, whereas the traditional 3P mode always makes Chinese EFL learners "feel frustrated with remembering English words" for having inadequate vocabulary learning strategies (Ma, 2012, p. 1199). Therefore, the importance of being "trained to notice" should be stressed in English vocabulary teaching.

Lexical awareness and language proficiency
For decades, researchers have explored the correlation between language awareness and language proficiency. James and Garrett (1992) argued that "language awareness needs no justification in terms of improvement in skill, just as biology does not have to prove that it has led to improved crop or stock production." As for foreign language learning, research has found that information processing speed is an important factor in determining the success of information reception in a foreign language. This ability, in turn, is entirely dependent on the readiness with which patterns are recognized, and meaning decoded, while the message is retained in the listener's short-term memory (Ilaria, 2015). Therefore, the quick recognition of language patterns This study examined the correlation between lexical awareness and language competency. First, the lexical approach increases learners' awareness of co-occurrence words which account for the majority of English text. An increase in the number of collocations and clusters learned through DDL activities helps students to reduce reading processing time. Second, corpus data increase learners' awareness of word frequency, enabling them to pay special attention to high-frequency words. An in-depth study of these words, including knowing high-frequency collocates, clusters, and closely related topics deepens learners' understanding of word use, which may lead to enhanced reading comprehension. Third, the increased awareness of both word frequency and word environment through DDL activities improves word discrimination and idiomatic collocation abilities, leading to a reduction in Chinglish expressions when writing. As J. Li (2018, p. 498) states, "when learners value lexical chunks and take lexical, grammatical and semantic occurrence seriously, their vocabulary knowledge of syntagmatic relations will be substantially improved."

Lexical awareness and language learning attitude
English teachers in China have long used a traditional deductive approach that focuses on teaching grammatical rules, sentence structures, and the use of words, followed by examples of use. This approach is considered logical, efficient, and time-saving (Fischer, 1979; Hammerly, 1975). However, many students who are familiar with language rules and possess a broad vocabulary fail to develop the ability to apply this knowledge to their own language use, because their grammar and vocabulary have not been acquired through real language exposure. In other words, even those students with knowledge of the language and an awareness of language structure may find it difficult to use the language in real-life contexts.
Lexical awareness focuses on both language knowledge and language use, and encourages intentional and explicit reflection on language and language learning through the learner's active involvement. In this study, DDL activities were used for two purposes. First, through the use of self-accessible authentic text in a concordancing program, DDL was found to improve the analytical power and independence of English learners (Asuman et al., 2016). This student-centered exploratory learning approach enables students to explore language use by observing authentic language data. Thus, learners become active participants in the learning process, in which they discover and formulate the rules of language use for themselves. Second, DDL is a more interesting and memorable way of learning than memorizing words and language rules through direct teaching. This is inferred from the results of this study, which indicate that most participants enjoyed self-directed learning (74.7%) and most said they would continue to use this approach (76.3%). Thus, this approach enhances the autonomy of language learners and has a long-term effect on language development. Both of these are of great importance to intermediate and advanced learners, who are more likely to lose the motivation to learn and who may have fewer opportunities to take formal language courses. This bottom-up approach is perhaps more suitable for intermediate learners, such as the participants in this study, and advanced learners, as they already possess the basic language skills, and when exposed to real language data, their understanding of the English language will deepen and their awareness of language use will improve.

Note: Among the 159 examinees, one student applied for exemption from taking classes and was, therefore, excluded from this study.

Xue, Cogent Education (2021)

Conclusion
This study examined the effects of data-driven learning activities on increasing lexical awareness among intermediate EFL learners. Following a semester-long College English Course, in which the iWeb corpus and other real communicative materials were used and the EIODI teaching pattern was adopted for both in-class and after-class activities, students reported increased awareness of lexical chunks and a positive impact on English reading, writing, and speaking skills. Questionnaire data revealed that most learners developed positive attitudes toward this approach and were willing to continue using it in the future. Test results also suggest that the lexical approach with DDL activities adopted during the course has a positive effect on overall English proficiency.
One limitation of this study, however, is that the test results reported here compare the final test scores of two groups of students, while the progress made by the students involved in the study was not examined by way of pre- and post-course tests. Therefore, in order to substantiate students' perceptions of their improved abilities, appropriately designed pre- and post-course tests should be conducted in a follow-up study.

Implications
The findings of this action research have pedagogical implications for deeper and longer-lasting acquisition of lexical chunks. First, certain DDL activities proved effective in implementing the lexical approach. In this study, the teaching pattern of EIODI was designed and, at each stage, specific activities and tasks were carried out to internalize lexical chunk learning. Pereyra (2015) found that improved knowledge of lexical chunks did not necessarily follow the acquisition of lexical chunks, and the blending of extensive reading with a lexical approach was encouraged in order to promote lexical chunk acquisition. In the current study, apart from recognition tasks, production tasks such as public speaking and summary writing were also used to help learners to internalize what they had learned. Unlike recognition tasks, such as dictation, matching, or correcting, production tasks require learners to use lexical chunks flexibly in their own speech and writing. Corpus research has already shown that foreign language learners face difficulties in freely using collocations, and even controlled productive use can pose a serious challenge (Peters, 2015). Second, in this teaching experiment, the students' lexical awareness could be raised after intensive and explicit exposure to chunks in real language use, but strengthening lexical awareness and promoting language use in the long term remain a challenge. For EFL learners, lexical chunks cannot be acquired as naturally and easily as those of their native language, and more memory work and processing time are required. Therefore, in order to sustain learners' enthusiasm for DDL, two types of data should be organically integrated; that is, the use of corpora should be accompanied by real communicative contexts. In addition, new teaching strategies such as the flipped classroom, the use of mobile applications, and even the use of humor (Nagy, 2020; Özcan & Kert, 2020; Suranakkharin, 2017) could be absorbed into the course design.