SimpleApprenant: a platform to improve French L2 learners’ knowledge of multiword expressions

We present SimpleApprenant, a platform that aims to improve French L2 learners’ knowledge of multiword expressions (MWEs). SimpleApprenant integrates an MWE database annotated with Common European Framework of Reference for Languages (CEFR) levels and several Natural Language Processing (NLP) tools: a spelling checker, a parser, and a set of transformation rules. These tools and resources are used to build training and writing exercises that improve the MWE knowledge and writing skills of French L2 learners. We present the user scenarios, the platform’s architecture, and a preliminary evaluation of its NLP tools.


Introduction
MWE knowledge improves proficiency in writing (Paquot, 2018). However, language learners have difficulties using MWEs, which are characterized by strong lexical preferences, syntactic constraints and non-compositional sense (Baldwin & Kim, 2010): learners must use them in the right context and apply the correct morphosyntactic constraints. For instance, jeter l'éponge 'to abandon' has a figurative sense and requires a singular determiner, while jeter les éponges 'throw the sponges' is used with the literal sense. Collocations have strong lexical preferences (poser une question 'put a question', but not *demander une question 'ask a question'). Word-for-word translation of MWEs generally fails (passer l'arme à gauche 'to kick the bucket').
Existing online platforms for L2 learners (Language Muse, WritingMentor) provide few lessons dealing with English MWEs. For French, projects such as Base Lexicale du Français (Verlinde, Binon, & Bertels, 2008) or DIRE Autrement (Hamel, 2010) represent MWEs' morphosyntactic and semantic features or collocation usage (Schneider & Graën, 2018). However, with the exception of a few websites (e.g. Bonjour de France, Le Point du FLE), these resources do not offer CEFR-graded exercises. To fill this gap, SimpleApprenant proposes graded exercises, annotated with CEFR levels, for MWE learning. The learner's level is used to select adequate content from SimpleApprenant's database. Moreover, the platform provides immediate feedback through automatic error correction. To this end, it integrates the spelling checker LanguageTool (Naber, 2003) and the parser Mind the Gap (Coavoux & Crabbé, 2017). We also developed a set of transformation rules, which take as input parsed text previously checked by LanguageTool.

The SimpleApprenant platform and its scenarios
SimpleApprenant proposes three scenarios for learners, who freely register on the platform by indicating their CEFR level. In the first scenario, the learner matches MWEs (compatible with the learner's CEFR level) with the appropriate definition or gap-filling phrase. By repeating these exercises and receiving positive and negative feedback, the learner acquires the MWEs' definitions and usage (Figure 1 below).
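The first scenario can be illustrated with a minimal sketch. It is not the platform's implementation: the `MweEntry` class, both function names, the exact-match level filter and the feedback strings are assumptions made for illustration, and the CEFR level attached to the sample entries is a placeholder rather than a value from the SimpleApprenant database.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class MweEntry:
    # Hypothetical, reduced entry: only the fields this exercise needs.
    lemma: str
    definition: str
    cefr_level: str


# MWEs quoted in the introduction; the "B1" label is a placeholder,
# not a level taken from the SimpleApprenant database.
SAMPLE_ENTRIES = [
    MweEntry("jeter l'éponge", "to abandon", "B1"),
    MweEntry("poser une question", "put a question", "B1"),
    MweEntry("passer l'arme à gauche", "to kick the bucket", "B1"),
]


def build_matching_item(entries, learner_level, n_choices=3, rng=None):
    """Pick a target MWE at the learner's level and shuffle candidate definitions."""
    rng = rng or random.Random()
    # Assumption: "compatible" is read here as "same CEFR level".
    pool = [e for e in entries if e.cefr_level == learner_level]
    if len(pool) < n_choices:
        raise ValueError("not enough MWEs at this level")
    chosen = rng.sample(pool, n_choices)
    target = chosen[0]
    definitions = [e.definition for e in chosen]
    rng.shuffle(definitions)
    return target, definitions


def give_feedback(target, selected_definition):
    """Return (is_correct, message); a wrong answer reveals the right one."""
    if selected_definition == target.definition:
        return True, "Correct!"
    return False, f"Correct answer: {target.definition}"
```

A front end would typically render the first message in green and the second in red, as described for Figure 1.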
In the second scenario, the learner writes an essay using at least one expression from a list of MWEs labeled with the learner's CEFR level. The teacher evaluates the essays and gives manual feedback on MWE usage in context. The learner may repeat these exercises with new MWEs.

Figure 1. The learner matches MWEs and their definitions (first scenario). A green message is displayed if the correct answer is selected; otherwise, the right answer is displayed in red.
The last scenario aims to improve learners' writing skills. The learner submits texts to the platform, where they are first processed by LanguageTool, the spelling checker integrated into SimpleApprenant. If necessary, the learner corrects the spelling errors identified by LanguageTool. The corrected texts are then parsed by Mind the Gap. If required, learners apply one of the transformation rules to their parsed texts and receive the transformed text. This feedback should help learners avoid some grammar errors.
SimpleApprenant is currently used by French language learners and their teachers from Opole University (Poland) (A1-C1 levels) and the University of Cyprus (A1-A2 levels). This setting offers several CEFR levels, comparable target audiences (native speakers of Polish or of Greek), and the possibility of following the same students over several years. The teachers and students use the platform during classes as an additional resource, but also at home, mainly for MWE learning or for collecting written essays. The platform is introduced gradually, from the first scenario (A1-A2) to the third (B2-C1), according to learners' CEFR levels.
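The order of operations in this scenario (spell-check, learner correction, parsing, optional transformation) can be sketched as a small pipeline. Every callable below is a hypothetical stand-in: none of them reproduces the actual interface of LanguageTool or Mind the Gap, and the toy functions exist only so that the sketch runs end to end.

```python
from typing import Callable, List, Optional, Sequence

# Placeholder type: a real parse would be a dependency tree, not a token list.
Tree = Sequence[str]


def run_writing_scenario(
    text: str,
    check_spelling: Callable[[str], List[str]],
    fix_spelling: Callable[[str, List[str]], str],
    parse: Callable[[str], Tree],
    rule: Optional[Callable[[Tree], str]] = None,
) -> str:
    """Spell-check, let the learner correct, parse, then optionally transform."""
    errors = check_spelling(text)
    if errors:
        # On the platform this step is performed by the learner.
        text = fix_spelling(text, errors)
    tree = parse(text)
    if rule is None:
        return text
    return rule(tree)


# Toy stand-ins (assumptions, for illustration only).
KNOWN_TYPOS = {"fromaje": "fromage"}


def toy_check(text: str) -> List[str]:
    return [w for w in text.split() if w in KNOWN_TYPOS]


def toy_fix(text: str, errors: List[str]) -> str:
    return " ".join(KNOWN_TYPOS.get(w, w) for w in text.split())


def toy_parse(text: str) -> Tree:
    return text.split()


def toy_delete_adverb(tree: Tree) -> str:
    # Mimics a deletion rule by dropping one hard-coded adverb.
    return " ".join(w for w in tree if w != "vraiment")
```

Passing the stages as parameters keeps the control flow independent of any particular spelling checker or parser.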
We built the MWE database from Lexique-Grammaire (Gross, 1994) and from French vocabularies (Beacco, Bouquet, & Porquier, 2004). An MWE entry contains lemma, category (idiom, collocation), definition, gap-filling phrases (extracted from French Wiktionnaire), syntactic patterns, and CEFR level (Table 1). The CEFR level is automatically identified from a graded textbook corpus (Todirascu, Cargill, & François, 2019) or manually assigned from reference textbooks (Beacco et al., 2004; Gonzalez Rey, 2007).
SimpleApprenant uses LanguageTool to detect spelling errors and Mind the Gap to create a dependency analysis of the corrected texts. The learner is asked to correct the spelling errors detected by LanguageTool. Then, the texts are parsed and the learner applies one of the transformation rules implemented in SimpleApprenant. Six deletion rules suppress adverbs and relative or participial clauses. Thirteen correction rules handle common mistakes such as verb agreement, determiner agreement, or negation errors. Complex transformation rules include passive to active voice conversion and the transformation of cleft sentences into a subject-verb order structure. After the rule is applied, the learner consults the transformed text (Figure 2 below).
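As an illustration of what a correction rule for negation might look like, the sketch below inserts a missing pas after the first verbal token that follows ne or n'. It is a deliberate simplification and not the rule implemented in SimpleApprenant: it works on a flat list of POS-tagged tokens rather than on a dependency tree, and the `Token` class, the tag names and the word lists are assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Token:
    form: str
    pos: str  # coarse tag (assumed tagset), e.g. "PRON", "ADV", "AUX", "VERB"


NEG_CLITICS = {"ne", "n'"}
# Second negation elements; a partial list, and forms such as "personne"
# are ambiguous with nouns, which this sketch ignores.
NEG_SECOND = {"pas", "plus", "jamais", "rien", "personne", "guère"}
VERBAL_POS = {"AUX", "VERB"}


def add_missing_pas(tokens: List[Token]) -> List[Token]:
    """Insert 'pas' after the first verbal token following 'ne' / "n'"."""
    forms = [t.form.lower() for t in tokens]
    if any(f in NEG_SECOND for f in forms):
        return list(tokens)  # negation already complete: leave unchanged
    out = list(tokens)
    for i, tok in enumerate(tokens):
        if tok.form.lower() in NEG_CLITICS:
            for j in range(i + 1, len(tokens)):
                if tokens[j].pos in VERBAL_POS:
                    out.insert(j + 1, Token("pas", "ADV"))
                    return out
    return out  # no 'ne' found, or no verb after it


def detokenize(tokens: List[Token]) -> str:
    return " ".join(t.form for t in tokens)


# Input sentence taken from the caption of Figure 2.
example = [
    Token("Je", "PRON"), Token("n'", "ADV"), Token("ai", "AUX"),
    Token("oublié", "VERB"), Token("de", "ADP"), Token("le", "PRON"),
    Token("demander", "VERB"),
]
print(detokenize(add_missing_pas(example)))
# prints: Je n' ai pas oublié de le demander
```

A rule working on dependency trees could instead check whether the verb governing ne already has a negation dependent, which would handle multi-clause sentences and restrictive ne ... que more reliably than this token-level check.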
The transformation of learners' texts is a challenging task because the input is often erroneous. Mind the Gap is a state-of-the-art French dependency parser, but its performance drops on learner data: for unlabeled dependencies, the best F1 (harmonic mean of precision and recall) is 95.53% for reference data (Coavoux & Crabbé, 2017) but only 83.12% for learners' essays (obtained for 100 phrases of our corpus). We evaluated the transformation rules on 273 parsed sentences. Fifty-five (20.15%) were either not transformed or contained errors in the output. Out of those 55 sentences, 34 (61.82%) did not show any change and 21 (38.18%) were transformed but contained errors. Deletion (82.71%) and correction rules (74.72%) are more effective than the complex transformation rules. The rules failed because of parsing errors (due to erroneous learner input) or because the syntactic pattern of the rule did not match the sentence. Even if some rules failed, agreement or negation errors are handled properly by the deletion and correction rules. As such, learners may still receive feedback from these rules and see how their own text is transformed.

Figure 2. The learner applies the rule that adds the second negation particle pas to the original dependency tree for Je n' ai oublié de le demander 'I do not forget to ask it', which becomes Je n' ai pas oublié de le mentionner dans mes messages.
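The proportions reported in the evaluation can be re-derived from the sentence counts given in the text. The short check below is only a reading aid: the helper names are assumptions, and the overall success rate it prints is computed here from the reported counts rather than stated in the paper.

```python
def pct(part: int, whole: int) -> float:
    """Percentage rounded to two decimals, as reported in the text."""
    return round(100 * part / whole, 2)


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


TOTAL = 273         # parsed sentences used in the evaluation
FAILED = 55         # not transformed, or transformed with errors
UNCHANGED = 34      # failed: output identical to input
WRONG_OUTPUT = 21   # failed: transformed but erroneous

assert UNCHANGED + WRONG_OUTPUT == FAILED

print(pct(FAILED, TOTAL))          # 20.15
print(pct(UNCHANGED, FAILED))      # 61.82
print(pct(WRONG_OUTPUT, FAILED))   # 38.18
print(pct(TOTAL - FAILED, TOTAL))  # 79.85 (derived, not stated in the text)
```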

Conclusion and further work
We present an online platform for French L2 acquisition, SimpleApprenant, which includes NLP tools supporting reformulation strategies. A large MWE database, annotated with CEFR levels, is used to create exercises focusing on MWEs. The exercises are selected according to learners' CEFR levels and generated with the help of NLP preprocessing tools: a spelling checker and a parser. The evaluation of the transformation rules shows that some of them should be improved before being used by teachers and learners. We are currently revising the rules to improve the system's feedback. The evaluation of the platform via Web questionnaires has started with beginner and intermediate learners (A1-A2) and with teachers. The questionnaires ask the learners to classify the exercises by difficulty and usefulness. The evaluation is still in progress and will be extended to higher learner levels (B2-C1).
