Fostering learning with the EcoLexicon corpus in the ESP classroom

This pilot study provides a preliminary account of students’ attitudes toward using specialized corpora in English for specific purposes (ESP) classes. Learners (N=39) were introduced to the EcoLexicon corpus and trained to use its basic query tools. The rationale behind this activity was to introduce learners to contextualization patterns and genre-specific features of the professional target language, which in turn would help ensure the acceptability and appropriateness of their linguistic choices. The learners were offered a series of guided and independent tasks on terminology disambiguation and corpus-assisted speech production. At the end of the semester a survey was administered to the students to assess their perception of hands-on corpus experience. Descriptive statistics show preliminary evidence that corpus tools provide illuminating data, foster understanding of nuances within groups of synonymous words, and increase overall language awareness. Hands-on data-driven learning (DDL) experience presented a few challenges, which may nevertheless be remedied by careful design of teaching materials and assignments.


Introduction
Introducing corpora in language learning draws on DDL (Johns & King, 1991). Various tools and methodologies are widely used in DDL, including but not limited to language for specific purposes, frequency lists and learner corpora, error correction and contrastive analysis, and corpus use in syllabus design (Boulton, 2017). According to a meta-analysis of quantitative DDL studies by Boulton and Cobb (2017), DDL fosters efficient language acquisition and develops students' analytical and problem-solving skills as well as learner autonomy (Vyatkina & Boulton, 2017).
Potential limitations of DDL are associated with the complex interface of many available corpora, as they were designed by linguists for linguists; introducing corpus query tools to non-linguist students therefore requires preliminary training. In addition, the numerous examples derived from large corpora might be misleading for learners. On the instructor's side, corpus-based pedagogic design requires considerable preparation time. From the learners' perspective, the level of language proficiency might be an obstacle to the direct use of corpus data, as the query output needs to be carefully tailored and simplified for novice learners.
Today DDL research falls into three major categories (Boulton, 2017): learner corpora research (analysis of learners' oral and written production); corpus-based pedagogic material design; and inductive learning (hands-on experience of learners with existing or specifically designed corpora).
The aim of this project was to investigate whether specialized language corpora belong in an ESP classroom, to examine learners' perceptions of the hands-on DDL experience, and to discuss potential limitations as well as ways to address them.

Method
This study collected preliminary data regarding learners' (N=39) attitudes toward using a freely available corpus as a lexicographic reference tool in their ESP/translation studies classes. All participants were offered preliminary training on Sketch Engine basic query functions. This was followed by a series of guided search activities aimed at key terminology disambiguation. At the final stage the search activities were assigned as weekly homework with on-site follow-up discussion.
At the end of the semester the students were asked to complete an anonymous questionnaire rating their experience, which included open-answer options. Microsoft Excel was used to analyze the data by means of descriptive statistics.

Participants
Thirty-nine RUDN University students (median age=19) took part in this project; their English proficiency was B2-C1 on the CEFR scale, according to the results of Cambridge English exams. All participants were environmental sciences majors enrolled in double diploma programs with a minor in specialized translation.

Instruments and procedure
During the spring semester of 2018-2019, three groups of students aged 18-20 were invited to use the EcoLexicon online tool as a lexicographic reference source in cases where bilingual and monolingual dictionaries failed to provide a clear understanding of differences in meaning or usage between near-synonymous words.
EcoLexicon is a 23-million-word corpus of contemporary environmental texts and an extensive terminological knowledge base on the environment (León-Araúz, Martin, & Reimerink, 2018). It is available for access and query in the corpus query system Sketch Engine.
The students were introduced to the basic features of the Sketch Engine analysis tool and pre-taught to use it. The project lasted 16 weeks: 4 weeks of introduction and guided practice, 10 weeks of independent practice with on-site follow-up, and 2 weeks of evaluation.
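To illustrate the kind of concordance view that such query tools produce, the following is a minimal keyword-in-context (KWIC) sketch in Python. It is an illustration only: it does not use Sketch Engine or its interface, the function name kwic and the width parameter are hypothetical, and the sample sentences are invented rather than taken from the EcoLexicon corpus.

```python
import re


def kwic(text, keyword, width=30):
    """Return keyword-in-context lines for every whole-word match of `keyword`."""
    lines = []
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    for match in pattern.finditer(text):
        # Take up to `width` characters on each side of the match.
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left context so that the keywords line up in a column.
        lines.append(f"{left:>{width}} [{match.group(0)}] {right:<{width}}")
    return lines


# Invented sample sentences (an assumption for illustration, not corpus data).
sample = (
    "Soil erosion removes fertile topsoil from sloping fields. "
    "Chemical weathering slowly breaks down exposed bedrock. "
    "Coastal erosion threatens settlements built near cliffs."
)

for line in kwic(sample, "erosion"):
    print(line)
```

Reading the aligned contexts of two near-synonymous terms side by side (here, for instance, "erosion" and "weathering") is the kind of comparison the guided search activities relied on.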

Corpora for ESP
In an ESP class a specialized corpus can serve as a unique tool for overcoming existing asymmetries in terminological systems of source and target languages.
Such asymmetries often stem from extra-linguistic factors: for example, numerous nature conservation technologies are not yet implemented in Russia, and this is directly reflected in the learners' source language. Prospective specialized translators need to patch numerous lexical gaps by creating new terms in the L1. It is therefore essential for language for specific purposes (LSP) instructors to provide novice specialists with reliable tools and means that facilitate the creation of precise, accurate, and unambiguous terms. LSP learners, in turn, need specific corpus tool training to be able to make informed linguistic choices in the future.

Learners' perception of DDL experience
Descriptive statistics provided preliminary results on the learners' perception of corpus tools. Figure 1 is a histogram of the distribution of learners' perception of the complexity level of corpus-based assignments. The perception survey was administered to gather informal feedback on the project at its preliminary stage and to explore whether corpus-based activities are feasible in principle for non-linguist students.
The results are provided here to illustrate the outcome of the project; however, the author intends to address the perception issue in more detail in future research.
The largest group of respondents (N=19) considered the lexicographic assignments quite complex, and the second largest (N=13) considered them understandable. Few learners considered corpus tools extremely complex (N=5) or impossible to comprehend (N=2).
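The counts above can be converted into percentage shares of the whole group. The study itself used Microsoft Excel; the Python sketch below is only an illustrative equivalent, and the dictionary name, the function name, and the ordering of the categories are assumptions made for this example.

```python
# Counts reported in the text for each complexity rating (N=39 in total).
complexity_counts = {
    "understandable": 13,
    "quite complex": 19,
    "extremely complex": 5,
    "impossible to comprehend": 2,
}


def percentage_shares(counts):
    """Return each category's share of all responses, in percent (one decimal)."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


shares = percentage_shares(complexity_counts)
total_respondents = sum(complexity_counts.values())

for label, share in shares.items():
    print(f"{label}: {complexity_counts[label]} of {total_respondents} ({share}%)")
```

On these counts the "quite complex" group accounts for just under half of the respondents, so it is the largest category rather than an absolute majority.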

Figure 1. Complexity level
However, the majority of the participants (85%) acknowledged that corpus tools were helpful for terminology disambiguation. Among their comments were: "truly illuminating", "like a linguistic detective", "seems reliable reference source", and "sometimes might be useful". The remaining 15% were on the whole reluctant to master corpus tools, with comments such as "why do we need to do this at all", and even "holy mother of god, get me out of this".

Limitations and possible solutions
The survey also asked: "Do you see any challenges in using a specialized corpus?"
The answers can be subdivided into three categories. First, insufficient language proficiency might be a considerable pitfall; exposure to authentic professional contexts can be discouraging for lower-level students. The solution here might be to design instructor-guided activities and to simplify and tailor tasks. Second, non-linguist students in general demonstrate less interest in lexicographic discoveries. It might therefore be a good idea to introduce corpus tools gradually and only when other reference sources are of no help. Third, however user-friendly the corpus query interface is, learners find it challenging. To address this issue the instructor needs to pre-teach and guide search activities.

Conclusions
Corpus tools have immense potential for providing precise, accurate, and unambiguous data on specific terminology in professional contexts. The increasing availability of specialized corpora holds great promise of new advances for ESP learners, shifting the pedagogic focus from prescribed vocabulary lists to inductive learning and learner autonomy. Overall, the learners demonstrated a positive attitude toward hands-on corpus-based experience. Potential limitations of the approach, such as insufficient language proficiency, low motivation, and the complexity of the user interface, can be remedied by thoughtful pedagogic design. Further research might develop a systematic approach to overcoming terminological asymmetries between source and target professional languages by means of corpus tools.