Introducing corpus linguistic tools to EFL undergraduates and trainee teachers

Many language teachers use Information and Communications Technology (ICT) in their classrooms to create tasks, quizzes, or polls with general online learning platforms. Few teachers, however, have experience of incorporating online corpus tools into their teaching or assessment practices. This paper will explore how autonomous learning can be fostered by gradually introducing freely available lexical databases, online collocation dictionaries, pronunciation guides, concordancers, N-gram extractors, and other text analysis tools for vocabulary building, skills practice, or self-checking. Tasks used with English as a Foreign Language (EFL) undergraduates and teacher trainees on a Master’s Teaching English as a Foreign Language (MA TEFL) course will be presented. I will also explain why having some familiarity with linguistics research can enable teachers to use these applications more meaningfully.


Introduction
Teachers are increasingly aware of online tools that can support the teaching of EFL and are willing to incorporate a large number of them into their courses in higher education. These tools are used for many purposes, from supporting individual work to networking or sharing content. The use of online language reference tools based on corpus data is still in its infancy, though, both in English language teaching and in teacher training (Granath, 2009). At Karoli Gaspar University, EFL undergraduates and trainee teachers are offered a range of courses where corpus linguistic tools and techniques are used regularly, including study skills, advanced writing, business English, and a dedicated course for students on the MA TEFL module: Foreign Language Assessment Methods.
Advanced-level language learners have specific needs that make them distinct from other learners. Learning languages at higher levels necessitates different learning strategies from those used at beginner level (Politzer & McGroarty, 1985). The main differences lie in vocabulary acquisition and production: having learnt most words within the core vocabulary range (the first 2,000 word families), advanced learners have to acquire words within a lower frequency band (Nation, 2006). This requires a more conscious effort on the learners' part to actively look for opportunities to learn more words, or to read much more in order to facilitate incidental learning (Schmitt, 2000). If students are informed about word frequency lists, or other applications based on corpus linguistics, they can optimise their learning processes more easily. Another problem for advanced learners is finding the right collocations. A concordancer can help them to produce multi-word expressions used by native speakers or competent L2 users. Students also need to be aware of diverse discourse characteristics to sound natural or appropriate in various communicative environments. Samples of these, like low-frequency vocabulary and collocations, are difficult to find in coursebooks.
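To illustrate what a concordancer does, the following is a minimal keyword-in-context (KWIC) sketch in Python. It is not the implementation of any of the online tools used on the courses; the function name `concordance`, the `width` parameter, and the simple tokenisation rule are assumptions made for illustration only.

```python
import re


def concordance(text, keyword, width=3):
    """Return (left context, hit, right context) tuples for each match.

    Assumption: a token is a run of letters, optionally with an apostrophe
    part (e.g. "don't"). Online concordancers tokenise far more carefully.
    """
    tokens = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    lines = []
    for i, token in enumerate(tokens):
        if token.lower() == keyword.lower():
            # Up to `width` words on each side of the keyword.
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, token, right))
    return lines


if __name__ == "__main__":
    sample = "She made a decision quickly. They made an effort to help."
    for left, hit, right in concordance(sample, "made", width=2):
        # Align the keyword in a central column, as concordancers do.
        print(f"{left:>15} | {hit} | {right}")
```

Reading down the right-hand column of such output is what allows a learner to notice recurring partners of a word, such as "made a decision" or "made an effort".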
Corpus linguistics research can help in the above areas by raising awareness, speeding up the learning process, and promoting learner autonomy (Gavioli, 2009). In language teaching and learning, three main types of corpora are studied most often: authentic corpora to observe language use (Szudarski, 2017), learner corpora to study interlanguage (Granger, 2002), and multilingual corpora for translation studies and contrastive analysis (Flowerdew, 2012). The aim of this paper is to present a variety of corpus linguistic tools based on these three types of corpora, which were incorporated into the syllabi of several university courses for EFL learners.

Method
Online tools were first incorporated into several language courses in 2018, as early as the first semester. Table 1 lists some of these tools with their functions and online locations; however, this paper will discuss only the first two in detail. In the first term, students were trained through demonstrations of how each tool worked, explanations of research findings related to the tool, and encouragement to experiment with real data (their own texts, for instance) and to compare their findings with research results. The use of corpus tools had two main aims: to facilitate the self-assessment of work, and to help students improve their writing and speaking skills. From the second year onwards, the main focus was on analysing authentic original and translated texts and textbook language.
Students assessed their work in two main areas: receptive vocabulary knowledge and text production. The first tool they used was Vocabulary Profiler (lextutor.ca), which was used to analyse essays to find the ratio of academic words. Next, they compared their results with authentic text characteristics provided in the Typical Profiles section. The Vocabulary Size Test revealed the differences among students in their passive word knowledge. Doing the test stimulated a discussion on language learning styles and strategies, and discussing the results provided an opportunity to become familiar with some research into optimal vocabulary size for various purposes (Nation, 2006).
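The kind of profile that students obtained for their essays can be illustrated with a minimal Python sketch. It is not how Vocabulary Profiler itself is implemented: the three miniature word lists (`K1_WORDS`, `K2_WORDS`, `ACADEMIC_WORDS`) and the function `profile` are hypothetical, and the sketch matches exact word forms, whereas frequency-band profiling is normally based on word families.

```python
import re

# Hypothetical miniature lists for illustration only. A real profile relies
# on full frequency-band lists and an academic word list.
K1_WORDS = {"the", "of", "and", "a", "to", "in", "is", "this", "that", "are"}
K2_WORDS = {"survey", "improve"}
ACADEMIC_WORDS = {"data", "method", "significant", "research"}

BANDS = ("K1", "K2", "academic", "off-list")


def profile(text):
    """Return the percentage of running words (tokens) in each band."""
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    counts = dict.fromkeys(BANDS, 0)
    for token in tokens:
        if token in K1_WORDS:
            counts["K1"] += 1
        elif token in K2_WORDS:
            counts["K2"] += 1
        elif token in ACADEMIC_WORDS:
            counts["academic"] += 1
        else:
            counts["off-list"] += 1
    total = len(tokens)
    if total == 0:
        # An empty text has no meaningful ratios; report zeros.
        return dict.fromkeys(BANDS, 0.0)
    return {band: round(100 * counts[band] / total, 1) for band in BANDS}


if __name__ == "__main__":
    essay = "The data in this survey are significant"
    for band, share in profile(essay).items():
        print(f"{band:>9}: {share}%")
```

With a student's own draft as input, the share of academic words is the figure that can then be set against the typical profiles of authentic texts.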
The process of using the tools to check individual texts went through these phases: drafting the text; becoming familiar with the tool (e.g. what vocabulary levels exist, how to check words with the tool); analysing the text with the tool (e.g. types of vocabulary used, the ratio of K1, K2, and academic words); reflecting on, analysing, and peer-discussing the results; and finally, rewriting the text and submitting it.
[Table 1, surviving entry: (Biber, 1992), https://sites.google.com/site/multidimensionaltagger/home]

Results and discussion
Introducing corpus linguistic tools into various EFL courses resulted in an overall positive outcome, based on end-of-term student feedback (end-of-term course evaluation forms, unpublished raw data). During the sessions, students were provided with numerous practice opportunities so that they knew the rationale behind the tools' use in teaching, and felt comfortable experimenting with them at home. With such scaffolding provided (Hubbard, 2013), corpus linguistic tools have become an essential part of the syllabus without an explicit focus on applied linguistics research. The students, according to their feedback, gained empirical evidence about the quality of their writing and were able to assess themselves, thus creating an atmosphere of involvement, interaction, individualisation, and independence (terminology from Dudley-Evans & St John, 1998, p. 200).
The heavily technology-enhanced L2 environment meant, however, that there were some anxiety issues at the start of the course that had to be overcome. Even though all students are surrounded daily by ICT, individual help was necessary for those who were new to monitoring language, focussing on syntax, or understanding grammatical terminology. Another important consideration is that for the successful use of corpus linguistic tools in such courses, the language teacher needs to be literate in corpus linguistic research so that results can be connected to existing corpus findings and language acquisition theories.

Conclusions
The use of corpus linguistic tools in a variety of courses at the university had not been considered previously; rather, it emerged as a response to EFL learners' needs. The rationale behind introducing these tools was to raise language awareness and enable learner autonomy. The students were exposed to corpus linguistic tools directly, without first being offered an introductory course in the field. Once it became clear that the students were motivated and involved in the explorations, an active search for additional tools to incorporate followed.
An important conclusion to be drawn from the end-of-term assessment of the courses was that scaffolding was essential at two points: when the students first attempted to use the tools and when they wanted to interpret the results. The latter was particularly important in the case of tools which provided only numerical data as results. These percentages and ratios lent themselves to discussion, and the teacher could direct the group to further readings about the features of the language analysed.
I hope that sharing these ideas will provide the language teaching community with potential corpus linguistic activities, and that the widespread use of these tools will have a significant impact on traditional higher education language courses.