Laying the groundwork for a historical overview of high-impact CALL papers

This study traces the evolution of Computer Assisted Language Learning (CALL) through Research Articles (RAs) published in four major journals: ReCALL, CALL, Language Learning & Technology (LL&T), and CALICO Journal. The paper outlines the rationale and methodology of the project, which begins with downloading all 2,397 full RAs published in English, from the very first issues up to the end of 2019. This preliminary report already gives an overview of the history of the field; in particular, the increasing number of papers attests to the healthy state of research in CALL. Subsequent analysis focuses on a subcorpus of 426 papers chosen by frequency of citation within each year of publication, as a gauge of impact within the community. The final analysis will use computer tools to help identify methodologies, themes, and theories as they rise and fall over the years.


Introduction
CALL was sufficiently active in the 20th century to warrant its own journals, beginning with System (https://www.journals.elsevier.com/system) in 1973, CALICO Journal (https://journals.equinoxpub.com/index.php/CALICO) in 1983, ReCALL (https://www.cambridge.org/core/journals/recall) in 1989, CALL (https://www.tandfonline.com/toc/ncal20/current) in 1990, and LL&T (https://www.lltjournal.org/) in 1997. These have since been joined by numerous others, e.g. JALTCALL, in print or online, often specific to particular sub-fields or targeting region-specific authors and readers, not to mention the vast quantities published in other more general journals in applied linguistics or education, as well as in books and chapters, conference proceedings, and doctoral dissertations. Given this long and diverse history, the question we would like to ask is how researchers can gain an overview of the major trends in CALL over time. This study outlines the initial stages of selecting high-impact publications, the methodology for analysing them, and the initial findings. The ultimate goal of this narrative review is to code the papers, and then use various computer tools to assist with identifying the stages, subdivisions, and progression of CALL-related research over time. The result should provide a roadmap, revealing what we know today and how we arrived there, and, in light of this, suggest avenues for future research (Zhao, 2003).

Method
Given the vast literature on CALL, the first stage was to limit the scope of this review, and focus on the most impactful research publications. As a rule of thumb, journals are often considered to be among the most prestigious sources (cf. Lei & Liu, 2019); furthermore, each has at least one issue per year which provides useful continuity in this field, which Stockwell (2007) sees as 'highly technical' due to its rapid evolution. Journals also tend to provide first-hand empirical data, which may draw on years of work. Scimago Journal and Country Rank (SJR) and Journal Citation Reports (JCR) were then analysed to identify the journals with the highest impact, namely ReCALL, CALL, LL&T and CALICO Journal, all of which rank in the top 100. The choice corresponds to other recent syntheses, such as Gillespie (2020), who based his synthesis on ReCALL, CALL, and CALICO Journal, with mention of LL&T.
All the RAs from those four journals were downloaded and sorted chronologically; other published sections, like book and software reviews, reports, editorials, and commentaries were excluded, as were a dozen papers in languages other than English. The initial pool of data thus consists of a corpus of 2,397 RAs published in English, from the very first issues up until the end of 2019, the last full year prior to collection.
Manual analysis of the entire corpus of 2,397 RAs being impractical for present purposes, further choices were made to reduce the sample. Various options were considered: a random selection of papers from each journal in each year, or every nth year or issue, each of which risked missing highly influential papers. Instead, we opted for citation as a measure of impact in the field. By running a thorough search of Google Scholar, citations of all 2,397 RAs within our corpus of four CALL journals were recorded. We then chose the 15% most widely cited papers in each year as a useful cut-off point, with papers receiving the same number of citations all being included to avoid forced choices with their inevitable degrees of subjectivity. This means that the papers published in a given year are only in competition with each other, thus reducing bias between years. In the end, this gave us a subcorpus of 426 papers which can be shown empirically to have had a major influence in the field. This is substantially larger than the samples in most reviews, some of which cover only a handful of studies (see Plonsky & Ziegler, 2016).
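The per-year selection rule can be illustrated with a minimal Python sketch. It is not the procedure actually used in the study: the function name, the record layout, the rounding of the 15% quota upwards, and the sample data are all assumptions made for illustration only.

```python
import math
from collections import defaultdict


def select_top_cited(papers, share=0.15):
    """Keep the top `share` of papers by citations within each year, ties included."""
    by_year = defaultdict(list)
    for paper in papers:
        by_year[paper["year"]].append(paper)

    selected = []
    for year in sorted(by_year):
        group = sorted(by_year[year], key=lambda p: p["citations"], reverse=True)
        # Assumption: the quota is rounded up, with at least one paper per year.
        quota = max(1, math.ceil(share * len(group)))
        threshold = group[quota - 1]["citations"]
        # Every paper at or above the threshold is kept, so ties are never split.
        selected.extend(p for p in group if p["citations"] >= threshold)
    return selected


# Hypothetical data: ten papers in 2000 and four in 2001.
counts_2000 = [30, 9, 9, 4, 3, 2, 1, 1, 0, 0]
counts_2001 = [2, 1, 0, 0]
sample = [{"id": f"2000-{i}", "year": 2000, "citations": c}
          for i, c in enumerate(counts_2000)]
sample += [{"id": f"2001-{i}", "year": 2001, "citations": c}
           for i, c in enumerate(counts_2001)]

top = select_top_cited(sample)
print([p["id"] for p in top])
```

In this invented sample the nominal quota for 2000 is two papers, but the two papers tied on nine citations are both retained, so three papers are selected for that year. Because each year is ranked separately, a 2001 paper with only two citations is still selected.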

Results and discussion
The analysis is ongoing and far from complete, but some features are already becoming apparent (Figure 1, left). In terms of the entire pool of 2,397 RAs, CALICO Journal has published the most, with 788 RAs (33% of the corpus), followed by CALL (776). From the overall corpus, 555 RAs are never cited in other papers in these journals (23%), and a further 407 are cited once only (17%); overall, only about 16% (386) have been cited 10 times or more here. Conversely, the three most cited papers (see supplementary materials) are Warschauer (1995) in CALICO Journal, with 133 citations, followed by Chapelle (1998) and Blake (2000). The dates are revealing in that early papers, with longer post-publication periods in which to be cited, inevitably top the list. However, it should be remembered that retaining the top 15% from each year of publication avoids such bias overall. This gives weight to initiatives such as DORA (the San Francisco Declaration on Research Assessment), which underline how a journal's ranking is not a reliable indicator for individual authors or papers.
The table (in supplementary materials) also highlights the time taken for papers to be cited. Warschauer, for example, was not cited in the year following publication, was cited once in each of the next three years, and four times in the fifth year. In other words, 126 out of 133 citations (95%) occurred more than five years after publication. This corresponds to Park (2012), whose analysis of citations in CALL journals found that 77% were five years old or more, and 22% were at least 15 years old. Typically, the citation half-life in applied linguistics (i.e., the age that divides the references in a given article into two equal halves, older and more recent) is over ten years. The age of the ten highly cited RAs listed in the supplementary materials may seem particularly ironic in a technological field such as CALL, though (to put a generous spin on things) it may suggest that researchers are less interested in the fast-moving technologies than in the procedures and activities involved.
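The arithmetic behind these figures can be checked with a short Python sketch. The yearly counts for the most-cited paper are those reported above; the function names and the reference list used for the half-life example are hypothetical, and the half-life is taken here simply as the median age of an article's references.

```python
from statistics import median


def late_citation_share(early_counts, total):
    """Return the count and share of citations not covered by `early_counts`."""
    late = total - sum(early_counts)
    return late, late / total


def citation_half_life(citing_year, reference_years):
    """Median age of the references in one article (assumed definition)."""
    ages = [citing_year - year for year in reference_years]
    return median(ages)


# Reported counts for the first five years after publication: 0, 1, 1, 1, 4,
# out of 133 citations overall.
late, share = late_citation_share([0, 1, 1, 1, 4], 133)
print(late, round(share * 100))  # 126 95

# Hypothetical reference list for an article published in 2019.
refs = [1995, 1998, 2000, 2007, 2012, 2016, 2018]
print(citation_half_life(2019, refs))  # 12
```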

Conclusions
This paper has summarised the methodology and initial results from a historical overview of high-impact RAs in CALL to show that CALL research is generally in a healthy and expanding state, though the ages of citations are potentially worrying. The current phase involves reading and manually coding all papers by two researchers; a first batch has already been coded and divergences resolved, while a second batch is under way, and will be subjected to inter-rater analysis using Cohen's (1988) kappa. The categories are operationally defined, in what Riazi, Shi, and Haggerty (2018) call a "data-driven thematic approach" (p. 44). Nonetheless, such coding sheets have their limitations, so they will be complemented by computer-assisted discourse analysis featuring tools such as NVivo and AntConc to identify salient features and patterns. Together, the analyses should allow us to extract information about the research context (country, setting, programmes, and learning environment), the research participants (status, age group, proficiency, L1, and L2), as well as methodological and especially theoretical considerations (research methodology, research focus, and research theory). From there, the goal is to narrate the history of CALL and portray its development after four decades of existence.
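For readers unfamiliar with the inter-rater statistic mentioned above, the following Python sketch computes Cohen's kappa for two coders from observed and chance agreement. It is an illustration only: the function name, the category labels, and the coded values are invented and do not come from the study's coding sheets.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning one category per item."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must code the same non-empty set of items.")
    n = len(rater_a)
    # Observed agreement: proportion of items given the same code.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal proportions per category.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1:
        return 1.0  # both raters used one and the same category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical codes for the "research methodology" category of eight papers.
coder_1 = ["quant", "quant", "qual", "mixed", "qual", "quant", "mixed", "qual"]
coder_2 = ["quant", "qual", "qual", "mixed", "qual", "quant", "quant", "qual"]
print(round(cohens_kappa(coder_1, coder_2), 2))  # 0.61
```

In this invented example the coders agree on six of eight papers (0.75), but about 0.36 agreement would be expected by chance, which is why kappa is lower than the raw agreement rate.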