EFL learners and English email writing: developing a computerised diagnostic language assessment

Email remains a key mode of communication between faculty and students in higher education institutions. Composing appropriate email texts is an important skill for learners; however, little technological support is available for the pragmatic aspect of email communication – the ways in which social context influences language choices. Furthermore, pragmatics can be undertaught in the language classroom. One approach to providing support for learners while also addressing the issue of giving instruction to large class sizes is via computerisation. In this ongoing research project, we describe the development of a Computerised Diagnostic Language Assessment (C-DLA) of L2 English email writing for Japanese English as a Foreign Language (EFL) learners in Japanese higher education. The C-DLA provides automated feedback to learners on the pragmatic aspects of their draft email texts, with feedback adapting to learners’ success in resolving identified issues. We report on the development phases of the project, challenges encountered, and implications for further research.


Introduction
Email remains a key mode of communication between faculty and students in higher education institutions. In writing L2 English emails, students have access to a number of online tools that can help with the formal aspects of writing, such as grammar and spelling. There is little support, however, for the pragmatic aspect of email communication – the ways in which the surrounding social context of the communication influences our language choices. Furthermore, in comparison with other elements of language learning, pragmatics is often undertaught in the language classroom (McConachy & Hata, 2013; Taguchi & Roever, 2017). In the university English L2 learning context, there is also often a need to provide efficient, but still effective, feedback to large classes of learners, each of whom has particular strengths and weaknesses. Overcoming these challenges is important, as failure to adhere to L2 community pragmatic norms in email communications may lead to negative consequences for learners and their relationships with English L1 faculty (Economidou-Kogetsidis, 2011, 2016; Savic, 2018). One approach to addressing these classroom challenges is computerisation, which allows large classes of students to receive individualised feedback simultaneously.
In this study, we describe an ongoing research project: the development of a C-DLA of L2 English email writing for Japanese EFL learners at a Japanese computer science university. Our aim here is to describe the development process to date and the challenges encountered. The purpose of the C-DLA itself is twofold: to evaluate learner performance while simultaneously promoting learning via automated, individualised feedback. This follows a dynamic assessment-type approach, in which learner potential is assessed via the principled, systematic use of mediation (Poehner, 2008). The C-DLA development comprises two main phases: corpus development and software development.

Method
In the initial corpus phase, approximately 1,300 English L2 learner emails were collected and annotated. Using Google Forms, email text data were elicited via a set of four tasks with varying social contexts, based on the real-world needs of the learners. Task scenarios were based on the results of a questionnaire administered to a sample of the target student population, which elicited examples of common requesting scenarios in their academic and daily lives. High-frequency scenarios were assigned + or - values for three variables: power (P), social distance (D) and imposition (R), based on Brown and Levinson (1987; see Table 1 for definitions). Text data were then imported to WebAnno, a web-based platform, and manually annotated for perceived instances of pragmatic failure by expert English users, using a tailor-made coding scheme. The coding scheme was adapted from the CCSARP project (Blum-Kulka & Olshtain, 1984) and Economidou-Kogetsidis (2016) for use with email data specifically, and for identifying pragmatic failure rather than pragmatic features.
Table 1. Definitions of the contextual variables (- values)
-P (power): Receiver has a lower rank, title, or social position, e.g. salesperson serving a customer.
-D (social distance): Sender and receiver know, or identify with, each other.
-R (imposition): Small expenditure of energy by the receiver to carry out the request.
In the software development phase, a basic C-DLA prototype was created and tested. A user-friendly graphical user interface was created to enable learners to use the C-DLA without the need for detailed explanations. The C-DLA administers four email tasks for each assessment. For each task, students draft and submit an email in the input field. Instances of pragmatic failure in the student-created text are identified via rule-based parsing and string searching. Feedback messages are displayed automatically whenever an instance of pragmatic failure is detected. Learners then redraft their email, receiving feedback up to four times for each task. When the same instance of failure recurs across drafts, the feedback messages for it become more explicit. After all four tasks are completed, a summary report is generated for each learner, showing frequent instances of pragmatic failure across all four tasks and the amount of feedback required to complete the tasks successfully. Currently, the system is in the testing phase, being iterated and improved upon by the researchers, before a piloting process with learner participants begins.
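The detect, feedback and redraft cycle described above can be illustrated with a minimal Python sketch. It is not the project's implementation: the two rules, their regular expressions, the feedback wording and the function names (`detect`, `run_task`, `summarise`) are hypothetical and chosen only for illustration. The one detail taken from the description is the limit of four feedback rounds per task.

```python
import re
from collections import Counter

MAX_ROUNDS = 4  # feedback is given up to four times per task


def no_greeting(text):
    """True when the draft does not open with a greeting word (hypothetical rule)."""
    return re.match(r"\s*(dear|hello|hi)\b", text, re.IGNORECASE) is None


def bare_imperative(text):
    """True when 'please + verb' appears without 'could/would you' (hypothetical rule)."""
    has_imperative = re.search(r"\bplease (send|give|tell|check)\b", text, re.IGNORECASE)
    has_softener = re.search(r"\b(could|would) you\b", text, re.IGNORECASE)
    return bool(has_imperative) and not has_softener


# Each rule pairs a detector with feedback ordered from implicit to explicit.
# The messages are illustrative assumptions, not the C-DLA's actual feedback.
RULES = {
    "no_greeting": (no_greeting, [
        "Check the opening of your email.",
        "Your email does not begin with a greeting.",
        "Start with a greeting such as 'Dear Professor Smith,'.",
    ]),
    "bare_imperative": (bare_imperative, [
        "Think about how you phrased your request.",
        "Your request may sound too direct for this reader.",
        "Try a question form such as 'Could you ...?'.",
    ]),
}


def detect(text):
    """Return the labels of all rules whose detector fires on the text."""
    return [label for label, (detector, _) in RULES.items() if detector(text)]


def run_task(drafts):
    """Process successive drafts; feedback grows more explicit on repeated failures."""
    seen = Counter()
    rounds = []
    for draft in drafts[:MAX_ROUNDS]:
        messages = []
        for label in detect(draft):
            seen[label] += 1
            levels = RULES[label][1]
            # Repeated failures move one step along the explicitness scale.
            messages.append(levels[min(seen[label], len(levels)) - 1])
        rounds.append(messages)
        if not messages:  # no failures detected: the task is complete
            break
    return rounds, seen


def summarise(task_results):
    """Aggregate failure counts and feedback rounds across tasks."""
    totals = Counter()
    feedback_rounds = 0
    for rounds, seen in task_results:
        totals.update(seen)
        feedback_rounds += sum(1 for messages in rounds if messages)
    return {"frequent_issues": totals.most_common(), "feedback_rounds": feedback_rounds}


drafts = [
    "Please send me the file.",
    "Hello. Please send me the file.",
    "Dear Professor Smith, could you please send me the file?",
]
rounds, seen = run_task(drafts)
report = summarise([(rounds, seen)])
```

Keeping the detectors separate from the feedback levels means a rule can be refined without changing how feedback escalates, which suits a rule set that is still being iterated.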

Discussion and conclusion
One challenge in developing the C-DLA relates to the corpus data collection. For the C-DLA to be useful, it was necessary to implement rules for identifying specific instances of pragmatic failure. The primary purpose of the corpus phase, therefore, was to analyse the data and identify patterns within them that could inform rule creation and enhance the accuracy of the C-DLA system. In this way, we sought to avoid reliance on researcher intuition. Typically, corpus data are authentic; however, collecting authentic email data can be challenging, which led to the decision to elicit data via classroom-based tasks. A key benefit of this approach was the ability to control the contextual variables of the email task scenarios, as well as to collect the data efficiently and systematically. Relatedly, automatic annotation of pragmatics-focused corpora is an ongoing challenge for researchers; therefore, the corpus was manually annotated, a time-intensive process.
An additional challenge, in the software programming phase, relates to automatic head act detection. A given email text may contain multiple request-type phrases, which makes it difficult to identify the head act of the request accurately. In response to this issue, the rules incorporated into the software are undergoing regular iteration and adjustment in response to researcher and pilot participant feedback, with the aim of increasing accuracy. The fact that the texts are written by EFL learners increases the challenge of this task. As learners are not necessarily aware of L2 pragmatic norms in email communication, head act placement within the body of an email may be less predictable than in texts written by an L1 English communicator.
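The ambiguity described above can be made concrete with a short Python sketch that lists every sentence containing a request-type phrase and flags emails with more than one candidate. The patterns, the naive sentence splitter and the function names are assumptions made for illustration. The sketch only surfaces candidates and does not attempt to choose the head act.

```python
import re

# Hypothetical request-type patterns; the project's actual rules are not shown here.
REQUEST_PATTERNS = [
    re.compile(r"\b(could|would|can|will) you\b", re.IGNORECASE),
    re.compile(r"\bI would like (you )?to\b", re.IGNORECASE),
    re.compile(r"\bI want (you )?to\b", re.IGNORECASE),
    re.compile(r"\bplease\s+\w+", re.IGNORECASE),
]


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def request_candidates(text):
    """Return (sentence index, sentence) for each sentence matching any pattern."""
    candidates = []
    for index, sentence in enumerate(split_sentences(text)):
        if any(pattern.search(sentence) for pattern in REQUEST_PATTERNS):
            candidates.append((index, sentence))
    return candidates


def head_act_is_ambiguous(text):
    """True when more than one sentence could be the head act of the request."""
    return len(request_candidates(text)) > 1


email = (
    "Dear Professor Smith, I missed the class yesterday. "
    "Could you send me the slides? "
    "Please let me know if that is possible. Thank you."
)
candidates = request_candidates(email)
```

In this example both the question and the closing "Please let me know" sentence match a pattern, so any rule that assumes a single request phrase per email would need a further criterion to separate the head act from supporting moves.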
As this is a work in progress, the next steps of the project involve further piloting with small groups of learner participants, with the aim of administering the C-DLA to larger groups in classroom settings. This will allow the effectiveness of the adopted feedback approach to be evaluated by tracking learners over time as they engage with the C-DLA on multiple occasions.