CALL replication studies: getting to grips with complexity

Calls for replication studies are becoming more frequent, and Computer Assisted Language Learning (CALL) has now reached sufficient maturity to offer numerous studies that lend themselves to replication. Realistic and successful replications rely on transparency in terms of data, results, and methodology. Two published studies in the area of vocabulary CALL will be discussed from the perspective of their suitability for replication: Franciosi, Yagi, Tomoshige, and Ye (2016) and Kim and Kim (2012). Alzahrani (2017) is a replication of Franciosi et al. (2016), confirming the findings with a markedly different learner group. The replication used the same methodology, a slightly modified list of target words, and Saudi participants. Kim and Kim (2012) compared vocabulary learning across three different screen sizes. The flashcard software is not specified any further, nor are the target words. While such an underspecified methodology is less likely to lead to a successful replication that can strengthen the validity and reliability of research results in our field, it can still provide a good training opportunity for students to learn about methodology in CALL.


Introduction
In Applied Linguistics, calls for replication studies are becoming more frequent (Marsden, Morgan-Short, Thompson, & Abugaber, 2018; Plonsky, 2015; Porte & McManus, 2018; Smith & Schulze, 2013). In the field of CALL, approximate replications crossing the boundary from 'traditional' (non-CALL) second language acquisition studies into CALL have been criticised as being problematic to some extent (Chun, 2012), but the field of CALL has now reached sufficient maturity to offer numerous studies that lend themselves to replication.
Apart from the benefits to the research field in terms of increased reliability and generalizability of findings, replication studies also offer an excellent opportunity for students and young researchers to conduct their first independent piece of research. At Swansea University, the doctoral programme has for some time included a replication study carried out by the student in their first year. Using replication studies in Master's and undergraduate programmes is a more recent development. At this level, smaller studies that do not require data collection lasting more than a few weeks can be suitable for replication. Replicating such a study gives students the opportunity to learn about different research methods, how to critically review the literature in the field, and what types of methodology and statistical analysis are appropriate for their study. It also clearly demonstrates the difficulty of drawing conclusions from an often limited amount of data. If we expect future language teachers to engage with the research findings in their field in order to improve their own teaching practice, the experience of having conducted a small study themselves can prove very beneficial for their understanding of published research in CALL.
Here, I compare two very different replication studies done by students as part of their Bachelor of Arts (BA) or Master of Arts (MA) studies.
1. Swansea University, Swansea, Wales, United Kingdom; c.tschichold@swansea.ac.uk; https://orcid.org/0000-0001-8487-2209
How to cite this article: Tschichold, C. (2019). CALL replication studies: getting to grips with complexity. In F. Meunier, J. Van [...]

Two examples of replications
Realistic and successful replications rely on transparency in terms of data, results, and methodology. While a clear description of results is usually assumed to be a prerequisite for publication, the methodology and the data can be somewhat underspecified, a fact that becomes very noticeable when a study is considered for replication. Two published studies in the area of vocabulary CALL will be discussed in this light: Franciosi et al. (2016) and Kim and Kim (2012).

A transparent study replicated
Alzahrani (2017) is a replication of Franciosi et al. (2016), confirming the findings with a markedly different learner group. Franciosi et al. (2016) compared the short- and long-term word gains after a session of playing the simulation game Third World Farmer (in addition to practising the 29 target words using Quizlet) to the gains after using only Quizlet, where the total time on task remained the same for both groups. The learners (n=162) were Japanese university students. The replication used the same methodology, a slightly modified list of target words, and younger, female, Saudi participants (n=196). A pre-test of the vocabulary level of the learners was added to the methodology. This study found much lower word gains than Franciosi et al. (2016), but a similar difference between the experimental and the control group. As mentioned in Tschichold and Alzahrani (2018), despite the lower "rate of vocabulary retention among the Saudi learners, we can [safely] conclude that the results broadly support the findings [of the original study, i.e. using games] in [English as a foreign language] classrooms is beneficial for vocabulary acquisition" (p. 339).
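For student replicators, the comparison of short- and long-term word gains described above can be sketched in a few lines of Python. This is only an illustration, not the analysis used in either study: it assumes that each learner has three scores on the target words (before the session, immediately after, and at a delayed test), and the function names and all scores are made up for the example.

```python
from statistics import mean


def word_gains(pre, post, delayed):
    """Return (short_term_gain, long_term_gain) for one learner.

    Assumption: each score is the number of target words known at that test.
    """
    return post - pre, delayed - pre


def mean_gains(records):
    """Average the short- and long-term gains over a group of learners."""
    gains = [word_gains(*record) for record in records]
    short_term = mean(g[0] for g in gains)
    long_term = mean(g[1] for g in gains)
    return short_term, long_term


# Made-up (pre, post, delayed) scores, purely for illustration.
quizlet_only = [(4, 12, 8), (6, 15, 10)]
game_and_quizlet = [(5, 15, 11), (3, 13, 9)]

for name, group in [("Quizlet only", quizlet_only),
                    ("Game + Quizlet", game_and_quizlet)]:
    short_term, long_term = mean_gains(group)
    print(f"{name}: short-term gain {short_term}, long-term gain {long_term}")
```

Keeping the two gains separate makes it easier to see whether a difference between the groups survives until the delayed test.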
This year, two more MA students have replicated this study. They kept the pre-test introduced by Alzahrani (2017) and added a third group. The original two groups used either Quizlet only, or Quizlet for half the time and the game Third World Farmer for the other half; the third group played the game for the entire time and did not spend any time using Quizlet vocabulary flashcards. All three groups thus represent CALL conditions, but the third moves away from intentional word learning into purely contextual, incidental vocabulary acquisition. Whether the data will still show significant differences between the groups remains to be seen, especially as the group sizes are smaller than in the original study (results from these two studies were not available at the time of writing).

A less successful replication
The second study chosen for replication is Kim and Kim (2012). The authors compared vocabulary learning across three different screen sizes (iPod, smartphone, and Kindle size), using a sample of 135 Korean English as a second language students. The learners' task was to learn 30 words, with or without pictorial annotations. The "web-based self-instruction programme" (Kim & Kim, 2012, p. 65) used for this purpose is not specified any further, nor are the target words. This gives the students the opportunity to choose the learning materials and the software for a (very approximate) replication. The group of undergraduate students tasked with this topic for their final assignment chose and piloted a list of academic words in order to be able to test both English native speakers and second language students as subjects. As screen sizes have evolved since Kim and Kim's (2012) study, the number of screen sizes to compare was reduced to just two, essentially a PC screen and a smartphone screen. In order to further reduce the complexity, the pictorial annotations were also dropped, as these would have been difficult to find for the relatively abstract words used in the replication. A total of 70 participants took part in the experiment, randomly divided between the two screen conditions. The trend in the results could be seen to confirm the superiority of the larger screen for vocabulary learning, but the differences did not reach significance. Given such an underspecified methodology in the original paper, the replication is unlikely to strengthen the validity and reliability of the findings. However, what this kind of very approximate replication can provide is a good awareness among the student researchers of the issues in the field.
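The two methodological steps mentioned above, random allocation to screen conditions and a check of whether a difference between groups reaches significance, can be sketched as follows. The source does not state which statistical test the students used; the permutation test below is a stand-in chosen because it needs only the Python standard library. The function names, condition labels, and parameters are hypothetical.

```python
import random
from statistics import mean


def assign_conditions(participant_ids, conditions=("pc", "smartphone"), seed=0):
    """Randomly allocate participants to conditions with near-equal group sizes."""
    rng = random.Random(seed)
    ids = list(participant_ids)
    rng.shuffle(ids)
    # Deal the shuffled participants out in turn, so sizes differ by at most one.
    return {pid: conditions[i % len(conditions)] for i, pid in enumerate(ids)}


def permutation_p_value(scores_a, scores_b, n_resamples=10_000, seed=0):
    """Two-sided permutation test on the difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(scores_a) - mean(scores_b))
    pooled = list(scores_a) + list(scores_b)
    n_a = len(scores_a)
    extreme = 0
    for _ in range(n_resamples):
        # Re-label the pooled scores at random and recompute the difference.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Add one to both counts so the estimate is never exactly zero.
    return (extreme + 1) / (n_resamples + 1)


allocation = assign_conditions(range(70))
print(sum(1 for c in allocation.values() if c == "pc"), "participants on the PC screen")
```

Fixing the seed makes the allocation reproducible, which is itself a small contribution to the transparency that a later replication would depend on.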

Conclusions
Given our positive experiences with replications in the Ph.D. programme, we were interested in seeing how well replications would work in the MA and BA programmes. The aim of these replications by student researchers was not so much the strengthening of the validity of earlier findings, but the training in research methods this task would provide. With the publication of Porte and McManus (2018), this training task has now become more straightforward. A number of challenges do remain, not least a certain reluctance among students to do a replication for their thesis, as they are concerned about the originality of their work. With more replication studies being published, this particular point should become easier to address in the future.
