Japanese EFL undergraduate students’ use of the epistemic modal verbs may, might, and could in academic writing

Willem B. Hollmann; Kazuko Fujimoto; Masahiro Kuroda

doi:10.1515/cercles-2023-0014

Open Access Published by De Gruyter Mouton May 8, 2024

From the journal Language Learning in Higher Education

Abstract

Modifying and hedging one’s claims appropriately is an important characteristic of academic writing. This study focuses on the three main English modal verbs used to express “epistemic possibility” to avoid making strong statements, viz., may, might, and could. The purpose of this corpus-based study is to explore modal verb usage by Japanese EFL undergraduate students and consider pedagogical implications of our findings. Our analysis suggests that the Japanese students’ use of these modal verbs, especially could, has a tendency to be informal and insufficiently academic. While the Japanese students use could very frequently, they do not use it sufficiently in the sense of “epistemic possibility”, and some of their use is inappropriate not just in academic English but in English more generally. The observed high frequency of could may be related to topics and may also be due to the influence of L1. We discuss different factors that may explain the findings, based mainly on the overview of factors impacting on EFL learners’ use of academic English offered by Gilquin and Paquot (2008), suggest several additions to this overview, and discuss implications for the instruction of these modal verbs in academic writing and for the improvement of relevant teaching materials.

Keywords: academic writing; corpus-based study; epistemic modality; Japanese EFL learners; modal verbs

1 Introduction

Modal meanings include “epistemic possibility”, which we define here following Downing (2015: 343) as those that “assess the possibility, probability or otherwise of a state of affairs, according to the speaker’s limited knowledge or belief”. Numerous grammarians argue that modal verbs for epistemic modal concepts serve an important function in academic writing to avoid the impression of too much certainty and soften assertions (e.g., Becker and Feng 2020; Depraetere and Langford 2020; Hinkel 2020; Leech et al. 2009; Poole et al. 2019). Many grammarians also agree that modal verbs are difficult for ESL/EFL students to master due to their polysemy and semantic overlap (e.g., Larsen-Freeman and Celce-Murcia 2016).

General observations about the importance of epistemic modality in academic writing, difficulties this area may present for learners, and the lack of research specifically on the Japanese context motivated this corpus-based study on the three main modal verbs which are used to express “epistemic possibility”: may, might, and could.^[1]

There is a substantial amount of scholarship on the nature of academic writing more generally by English learners with different L1 backgrounds that points to learners using language that is more reminiscent of informal and/or spoken language (e.g., Aijmer 2002; Altenberg 1997; Breeze 2008; Gilquin and Granger 2015; Gilquin and Paquot 2008; Granger and Rayson 1998; Leedham 2011; McCrostie 2008; Özhan 2012; Petch-Tyson 1998; Tåqvist 2018; Virtanen 1998). Our hypothesis, based on the preponderance of studies that suggest that learners tend toward insufficiently formal use in their academic writing, is that this may apply to Japanese students’ use of modals as well. Most learner corpus studies have analyzed advanced learner data but Gilquin and Granger (2015: 419) have encouraged looking at learners at other levels. Compared to the CEFR levels of learners in many European countries, Japanese learners’ levels tend to be significantly lower, with the majority at A2 and B1 (Ishikawa 2015: 104; Tono 2020: 25). Regarding learners’ writing as a target of corpus research, scholars usually analyze discipline-specific writing rather than general essays or reflective writing (Aull 2019: 143), and little attention has been paid to registers other than reviews, theses and dissertations (Becker and Feng 2020: 257).

This study analyzes paragraph writing assignments in academic writing courses using a corpus compiled of data from Japanese undergraduates who fall roughly into the CEFR A2 and B1 levels. By focusing on the use of some modal verbs in academic paragraph writing (which has been studied less than other text types) by lower-level learners (who are still under-researched), we aim to contribute to understanding how these learners use modal verbs and what instruction may be helpful to them. This study will also use an American English (AmE) academic prose corpus. As we discuss in Section 3.1, we do not assume that the students see professional writers’ prose as their target. However, this corpus will be useful to gain insight into authentic usage of the three modal verbs. The reason for choosing AmE as our reference point is that English teaching in Japan is mostly oriented toward this variety. Consider for instance that in the Japan Exchange and Teaching Programme, 55.22 % of Assistant Language Teachers in 2022 were US nationals (JET 2022). Most English textbooks and English-Japanese dictionaries in Japan also privilege AmE, for example, with regard to spelling and pronunciation.

Based on our corpus findings and considering the risk of negative perceptions of English learners’ inability to use modal verbs appropriately, we argue that certainly in the Japanese context more attention needs to be paid to classroom instruction on these modals, especially could. We will discuss possible reasons as to why Japanese learners might deviate from AmE professional writers’ usage of may, might, and could in the ways they do, and suggest possible pedagogical implications for EAP instruction and teaching materials about these modals.

2 Previous studies of learners’ use of modal verbs

Aijmer (2002) compared Swedish, French, and German learners’ use of English in the International Corpus of Learner English (ICLE), which consists of argumentative essays by advanced proficiency level undergraduate students, with that of British university native speaker students in the Louvain Corpus of Native English Essays (LOCNESS).^[2] She reports French learners highly significantly overused may compared to native speaker students, but German and Swedish students did not show that pattern. This may suggest that the use of modal verbs varies depending on the learners’ first language. Aijmer also reports significant overuse of might across all three groups. As one of the possible reasons for this she mentions the influence of speech: might is more typical of conversation than of academic writing (cf. Holmes 1988: 29).

Aull and Lancaster (2014) analyzed first-year university students’ argumentative essays in a large public research university and a smaller, private liberal arts university in the US, and found that in the essays from the latter university modal verbs were used less frequently as hedges than approximative hedges including adjectives and adverbs (e.g., apparent, generally). They suggest that in order to expand the repertoire of stance features, such as hedges, students should be exposed to the characteristics of university and/or disciplinary writing (2014: 160).

Becker and Feng (2020) analyzed the use of modal verbs in the Physical Sciences subcorpus of the Michigan Corpus of Upper-Level Student Papers (MICUSP).^[3] This subcorpus was compiled from A-graded course assignment papers from final-year undergraduates through third-year graduates. Unlike graduate students, undergraduates used fewer possibility modals (including may, might, and could), which soften assertions, than prediction modals (e.g., will, would), which work in reverse. In this regard, Becker and Feng’s (2020) results confirm the findings of previous studies, that it is difficult for learners to adequately express the degree of certainty and that learners, without sufficient academic writing experience, tend to use boosters rather than hedges (e.g., Aull 2015, 2019; Hyland 2016).

Thus, there are studies analyzing the use of modal verbs at different learning stages. However, there is little research on the use of English by lower-level learners at the A2 or B1 CEFR levels, and still fewer studies have comprehensively discussed the factors that contribute to learners’ use of modal verbs.

Our research questions are as follows:

Are there any similarities and differences in the use of the modal verbs may, might, and could between Japanese learners of English and AmE expert writers?
If there are any differences, how might we explain them, and what might the pedagogical implications be?

3 Methodology

3.1 Corpora

In this study, we will mainly use two corpora: a Japanese undergraduate students’ academic writing corpus and an AmE academic prose corpus. Some studies have questioned the appropriateness of comparing corpora of non-native and native speakers, the argument being that non-native speakers may not (or perhaps should not) use native speaker language as their target (e.g., Hunston 2002: 211).^[4] Another possible question is around the comparability of corpora that differ in data collection methods and content, including text types but also proficiency level or learning stage (e.g., Friginal 2018; Rapp et al. 2022).^[5] On the other hand, there have been recent studies comparing learner corpora with existing expert writers’ corpora such as the Corpus of Contemporary American English (COCA)^[6] and the British component of the International Corpus of English (ICE-GB)^[7] (e.g., Aull 2019; Aull et al. 2017; Polio and Yoon 2021; Wulff and Gries 2021). Gilquin (2022: 91) offers a nuanced discussion of the arguments for and against using expert versus novice writers’ corpora as a reference corpus. She argues that for educational purposes expert writing is “a better model” and “a more ambitious target” for learners. Since one of the main purposes of this paper is to make pedagogical suggestions, expert writing is an appropriate reference point. We will use an AmE professional writing corpus not to present their usage as a necessary goal for the learners but to be able to describe authentic usage of modal verbs in academic writing. By referring to AmE professional writers’ usage, we aim to find how similar or different the not-advanced-level and non-genre-specific Japanese undergraduates’ use of the modal verbs is to/from that of AmE expert writers, and to suggest possible pedagogical implications for English language instruction and teaching materials about modal verbs in academic writing.

The learner corpus in this study is an 89,063-word corpus of written English by a total of 158 second-year Japanese EFL undergraduates who took an academic writing course of about one year (i.e., two semesters) in the Department of the English Language at the institution of one of the authors (Japanese Undergraduates English Corpus: JUEC). The students’ mean TOEIC score is 480.6 (Range: 250–755; Median: 473.1; SD: 117.4). The score range roughly falls within A2 to B1 on the CEFR.^[8] Because of the wide range of scores, the students are divided into five levels set by the university where the data were collected: Basic (Score Range: 0–280), Elementary (285–395), Intermediate (400–485), Upper Intermediate (490–620), and Advanced (over 625). Table 1 gives a description of JUEC. All the students used an academic writing textbook by Zemach and Islam (2005, 2011)^[9] compiled to target the B1 level. The students were allowed to use dictionaries and to set a time limit for themselves to complete their writing assignments. They submitted their first draft about a topic provided in each of the 11 units (Units 2–12) in the textbook. The topics are a mixture of argumentative and narrative ones (see Appendix A). JUEC was left untagged since modal verbs are easy to find using lexical search. The data in JUEC were analyzed with the computer software AntConc (Anthony 2020), and all the examples on concordance lines were manually examined using spreadsheets.

Table 1:

Total number of students, texts, and words in JUEC.

Proficiency levels	Number of students	Number of texts	Number of words
Basic	6	21	2,207
Elementary	35	158	18,549
Intermediate	42	186	22,299
Upper intermediate	54	265	32,421
Advanced	21	111	13,587

Total	158	741	89,063

Our reference corpus is an AmE professional writers’ academic prose 185,506-word subcorpus of the American English 2006 corpus (AmE06_L) in the Brown family of corpora compiled at Lancaster University.^[10] AmE06_L consists of 80 texts in various academic fields published between 2004 and 2008, and each text is approximately 2,000 words in length. The size of AmE06_L allows us manual examinations of all examples of the three modals.

3.2 Meaning categories of modal verbs

The analysis of the meanings of the three modal verbs in this study will be based on semantic categories distinguished by Declerck (1991).

May in (1) and might and could in (2) describe “epistemic possibility”. Declerck mentions, in relation to (2), that could and might can be used as tentative alternatives to can and may:

(1)

We may go to Amsterdam next week./He may be right. (Declerck 1991: 397)

(2)

This might/could be a very important clue. (Declerck 1991: 399)

Needless to say, in classifying the modals in our corpus examples, careful consideration of the context played a crucial role. Both of the linguists on the authorial team discussed examples comprehensively, with the final categorization based on mutual agreement.

In analyzing modal verbs, only those in main clauses were considered. This was done so as to exclude past forms of the modals which potentially follow the rule of the sequence of tenses in the subordinate clause. Our search included negative forms of the modals but we subsume these under may, might, and could.
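The lexical search described above can be illustrated with a minimal sketch. It is not the AntConc procedure used in the study: the regular expression, the function name `concordance`, and the context width are all assumptions made for illustration. The sketch finds the three modals together with their negated forms and files each hit under the bare modal. Restricting hits to main clauses and assigning meanings would still have to be done by hand.

```python
import re

# Hypothetical pattern: may, might, could, optionally followed by "not"
# or a contracted "n't" (straight or curly apostrophe).
MODAL_RE = re.compile(
    r"\b(may|might|could)(?:\s+not|n['’]t)?\b",
    re.IGNORECASE,
)

def concordance(text, width=30):
    """Return (modal, left context, matched form, right context) tuples."""
    hits = []
    for m in MODAL_RE.finditer(text):
        modal = m.group(1).lower()  # negated forms are subsumed under the bare modal
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        hits.append((modal, left, m.group(0), right))
    return hits

sample = "It may be hard. She could not work, and we mightn't go. Maybe later."
for modal, left, form, right in concordance(sample):
    print(f"{modal:6} | {left!r} [{form}] {right!r}")
```

The word-boundary anchors keep maybe from being counted as may. A hit such as the month name May would still be returned, which is one reason every concordance line needs manual examination.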

4 Results

Figure 1 shows the results of the corpus search of the three modal verbs with the “epistemic possibility” meaning, with normalized frequency (per 10,000 words). The difference in frequency of the modal verbs between the corpora was examined by log-likelihood tests (Rayson and Garside 2000). May with the “epistemic possibility” meaning seems to be used less frequently in JUEC than in AmE06_L, though the difference between the corpora is not significant. Might with this meaning appears more frequently in JUEC than in AmE06_L, but as is the case with may there is no significant difference. In JUEC, may is used with the meaning of “epistemic possibility” for 93.8 % (45 of 48) of all occurrences, and might is used with this meaning in 100.0 % of cases (26 of 26). The frequency of could is noteworthy. Could with the “epistemic possibility” meaning is used significantly less in JUEC than in AmE06_L (see Appendix B). Though the total frequency of could is much higher than that of may and might in JUEC, it is used with the “epistemic possibility” meaning in only 2.0 % of cases (3 of 148). This means that the Japanese students’ use of could merits careful attention, as it may be distinctive in some way(s).

Figure 1:

Frequency of may, might, and could with the “epistemic possibility” meaning in JUEC and AmE06_L.
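As a rough illustration of the two calculations mentioned above, the following sketch normalizes a raw count to a frequency per 10,000 words and computes a two-corpus log-likelihood statistic in the formulation commonly associated with Rayson and Garside (2000). The function names are hypothetical. The raw counts for epistemic could (3 in JUEC, 20 in AmE06_L) are taken from Table 2, and the corpus sizes from Section 3.1. The authors' exact procedure and the figures in Appendix B are not reproduced here, so the result should be read as indicative only.

```python
import math

def per_10k(freq, corpus_size):
    """Normalize a raw frequency to occurrences per 10,000 words."""
    return freq / corpus_size * 10_000

def log_likelihood(freq_a, size_a, freq_b, size_b):
    """Two-corpus log-likelihood (G2) from observed and expected frequencies."""
    total_freq = freq_a + freq_b
    total_size = size_a + size_b
    expected_a = size_a * total_freq / total_size
    expected_b = size_b * total_freq / total_size
    g2 = 0.0
    for observed, expected in ((freq_a, expected_a), (freq_b, expected_b)):
        if observed > 0:  # a zero count contributes nothing to the sum
            g2 += observed * math.log(observed / expected)
    return 2 * g2

JUEC_WORDS = 89_063       # learner corpus size (Section 3.1)
AME06_L_WORDS = 185_506   # reference corpus size (Section 3.1)
juec_could, ame_could = 3, 20  # epistemic could, raw counts from Table 2

g2 = log_likelihood(juec_could, JUEC_WORDS, ame_could, AME06_L_WORDS)
print(round(per_10k(juec_could, JUEC_WORDS), 2),
      round(per_10k(ame_could, AME06_L_WORDS), 2),
      round(g2, 2))
```

A G2 value above 3.84 corresponds to p < 0.05 at one degree of freedom, the conventional threshold for treating a frequency difference between two corpora as significant.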

The modal verbs in (3)–(8) are used with the “epistemic possibility” meaning in the two corpora:

(3)

There are other elements of the outer shell that involve lateral motions, buoy-ancy [sic], chemistry, mineralogy or conductivity and these may or may not be part of the lithosphere. (AmE06_J02)

(4)

We have to check some magazines or TV shows every day to see famous people and catch up with a trend. It may be hard but those who are keen to fashion don’t care about it. (Upper Intermediate)

(5)

Clearly, it would be extremely difficult to remove all doubt, but it might very well be possible to offer strong supporting evidence for or against it. (AmE06_J51)

(6)

In addition, I didn’t know the taboo about foods. However, if we don’t know about it, we might make a fatal mistake. So I insist that we should know about some taboos in main religions and avoid making fatal mistakes. (Upper Intermediate)

(7)

The marginal effect of an increase in income on the probability of using a more distant airport could be negative or positive. (AmE06_J73)

(8)

Under five-year-old children’s action is impossible to foresee, in addition they can do nothing when they are endangered, so that we must not take our eyes off them. The irresponsibility for children could lead to disrespect for human life. (Upper Intermediate)

Semantic analysis, based on Declerck (1991: 389–390), of Japanese students’ use of could shows that the frequency of the “ability” meaning^[11] is significantly higher than in AmE06_L. Of all occurrences of could in JUEC, 89.19 % (132 of 148) carry the “past ability” meaning, as in (9), with 11 tokens describing non-past ability, as in (10):

(9)

Besides, recently, one of the employees broke her bone, so she could not work for one month at least. (Upper Intermediate)

(10)

First, it is dangerous to leave a child alone. If the child walked to kitchen and turned on the stove, nobody could not help the child. (Intermediate)

Let us now look at the Japanese students’ use of could with the “ability in the past” meaning according to their English proficiency levels. One might hypothesize that as students progress in their overall proficiency, their use of could should gradually approximate that of the AmE writers. However, as Figure 2 shows, the Japanese students’ usage in this respect actually seems to move away from AmE writers’ usage.

Figure 2:

Could with the “ability in the past” meaning in JUEC and AmE06_L (frequency per 10,000 words).

Biber et al.’s (2021: 487) corpus search findings show that could is much more frequent in conversation than in academic prose, which in this grammar includes both British and American data. In order to focus specifically on AmE we carried out a search of COCA^[12] and found that in these data the frequency of could is much higher in spoken than in academic texts, the difference being 1429.44 versus 935.32 per million words. The Japanese students’ highly frequent use of could therefore seems an instance of informal usage.

We now consider the learners’ use of the “epistemic possibility” meaning of the three modal verbs in more detail. It is worth looking at how the learners progress with respect to their usage of these verbs in this manner. This has been visualized in Figure 3. Once again, one might have expected that Japanese learners, as they progress in their studies, would gradually approximate AmE writers’ usage. However, that does not appear to be the case: it is not really possible to detect any trend toward the AmE reference corpus. We conclude from this that the current approach to teaching this important use of these modal verbs does not seem to lead to increased approximation of AmE writers’ usage. (We have acknowledged, above, that perfect approximation is not, or should not necessarily be, the learners’ goal, but it is an interesting observation nonetheless.)

Figure 3:

May, might, and could with the “epistemic possibility” meaning in JUEC according to English proficiency levels (frequency per 1,000 words).

5 Discussion and pedagogical implications

On the basis of our corpus search findings, the frequencies of may and might with the “epistemic possibility” meaning are not significantly different between JUEC and AmE06_L. The overall frequency of these modals in JUEC may provide some reassurance that these epistemic markers are broadly used appropriately, yet the lack of a clear direction toward AmE writers’ usage (see Figure 3) nevertheless raises questions about the way in which these verbs are taught. We will return to this later in this section.

The main observation to make, based on the data, concerns could. The Japanese learners use this modal significantly less often as an epistemic marker and very highly significantly more often to describe “past ability”, as in example (9). Whilst this example is perfectly acceptable, JUEC also contains instances of a rather different kind of “past ability”:

(11)

In high school days, I could get a driving license. I like watching cars from my childhood, so I was happy to get a permission to drive a car by myself. (Upper Intermediate)

(12)

Then I heard a voice, and I looked around. It was my grandmother who died 10 years ago. Her face looked very happy and she is speaking a lot to me. Before she disappeared, I could say that I am doing well at university and living happily. Then she went away. After a moment, I remembered that that day was the anniversary of my grandmother’s death. (Advanced)

As Declerck (1991: 394) says, could “cannot be used to ASSERT the PERFORMANCE of a DYNAMIC situation on a SPECIFIC OCCASION in the past”. The difference between (9), on the one hand, and (11)–(12), on the other, relates to event structure: (9) describes an ability on the part of the subject (she), whereas (11)–(12) portray a specific event that the subject was not only able to do, but did in fact complete on one particular occasion. This use is not acceptable for could in American (or British) English, although it is for was/were able to and for the negated form could not. JUEC contains as many as 22 tokens like (11)–(12), which is a substantial subset of the total number of examples: 16.67 % of all past ability uses of could.

Having highlighted the main difference between JUEC and AmE06_L, we now consider what may underlie the Japanese learners’ deviations from the AmE writers’ usage of the modals, especially could. Gilquin and Paquot (2008), one of many studies to have identified the tendency for foreign learners to sound colloquial when attempting to produce English academic prose (see Section 1 for additional references), offer a useful overview of four factors that may explain this pattern: the influence of spoken language, L1 transfer, teaching-induced factors, and general lack of familiarity with academic writing.

Regarding the first factor, i.e., the influence of spoken language, Gilquin and Paquot (2008: 52) argue that in the context of EFL learning authentic spoken input, be it from teachers, teaching materials or exposure to English television and other media, mostly tends to be limited, and is therefore unlikely to play a very substantial role in explaining colloquial features in academic writing. They suggest that authentic input via mass media may be high in countries such as the Netherlands, but in Japan it is probably not the case.

L1 transfer, Gilquin and Paquot’s second factor, is probably significantly more important in the present study, especially in causing the learners’ use of (ungrammatical) “past performance on one occasion” could (see (11)–(12)). Japanese dekita can have the “past ability” meaning, like could, but it can also describe “past performance on one occasion”, unlike could; see the following examples from Nishitani and Nakazaki (2015; translation ours):

(13)

Itsu-demo	Tarō	wa	50m	oyogu	koto	ga	deki-ta.
At.any.time	Tarō	SBJ	50m	swim	thing	NOM	can-pst
‘Taro could swim 50 meters at any time (i.e., whenever he wanted).’

(14)

Kinō	Tarō	wa	50m	oyogu	koto	ga	deki-ta.
Yesterday	Tarō	SBJ	50m	swim	thing	NOM	can-pst
‘Taro was able to swim 50 meters yesterday.’
(Nishitani and Nakazaki 2015: 50)

The seemingly significant impact of L1 transfer leads us to an important pedagogical implication of our study. Modal expressions, including modal verbs, display significant crosslinguistic variation in terms of their meanings and uses (see for example Nuyts and van der Auwera 2016). As EAP textbook authors would struggle to cover this variation, we would suggest that there is an onus on instructors to identify the main differences between the L1 and English in this regard. This should be done using descriptive grammars of the languages and/or studies specifically on modal verbs. In view of the well-known importance of “noticing” in second language learning (cf. Schmidt 1990, 2012), instructors should draw students’ attention to these differences, and continue to be mindful of and highlight them when marking and providing feedback on students’ work.

As for Gilquin and Paquot’s (2008) third factor, teaching-induced factors, the authors’ examples of this are all inadvertent “distort[ions of] learners’ sense of what is stylistically appropriate or not” (2008: 55). This factor clearly contributes to Japanese learners’ use of could. An interesting question in relation to this is: is there anything in the teaching method that causes the Japanese students’ usage of this modal? This is not an easy question to answer comprehensively as it would require a detailed study of the teaching and learning process in its entirety, including interactions between teachers and students in class, written feedback, and so on. However, for the present purposes, let us focus on the printed teaching materials which the Japanese learners in this study used: Zemach and Islam (2005, 2011; for details see Section 3.1) and in particular its coverage of could, which shows the most substantial deviations in the Japanese learner data from the AmE writers’ usage. The semantic analysis of all main clause instances of could in Zemach and Islam (2005), which was used by 55.7 % of the students (with the rest using a newer version of the same textbook), yields the following absolute and relative distribution of uses, with JUEC and AmE06_L included for the sake of comparison in Table 2 (normalized frequencies per 10,000 words not given because an EAP textbook, including instructions and in this case also word lists, is a very different genre than actual English academic prose).^[13]

Table 2:

Semantic analysis of all tokens of could in Zemach and Islam (2005).

	Zemach and Islam (2005)		JUEC		AmE06_L
	Raw freq	%	Raw freq	%	Raw freq	%
Epistemic possibility	3	10.71	3	2.03	20	42.55
Ability	17	60.71	143	96.62	21	44.68
Other	8	28.57	2	1.35	6	12.77

Total	28	100.00	148	100.00	47	100.00
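The percentage columns in Table 2 follow directly from the raw frequencies. The short sketch below recomputes them, assuming simple rounding to two decimal places; the function name `relative_distribution` is hypothetical, and the counts are copied from the table.

```python
def relative_distribution(counts):
    """Convert raw frequencies into percentages of the column total."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 2) for label, n in counts.items()}

# Raw frequencies of could by meaning category, as given in Table 2.
table2 = {
    "Zemach and Islam (2005)": {"Epistemic possibility": 3, "Ability": 17, "Other": 8},
    "JUEC": {"Epistemic possibility": 3, "Ability": 143, "Other": 2},
    "AmE06_L": {"Epistemic possibility": 20, "Ability": 21, "Other": 6},
}

for source, counts in table2.items():
    print(source, relative_distribution(counts))
```

Working with within-source percentages rather than normalized frequencies matches the choice made for Table 2, since the textbook is not directly comparable in genre to the two corpora.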

The semantic analysis of Zemach and Islam (2005) tells us that insofar as this textbook is a reasonable proxy for the teaching as a whole, it can be held partly responsible for the deviations observed in JUEC from AmE usage, represented by AmE06_L. The textbook underrepresents the “epistemic” use and overrepresents the “ability” use, compared to AmE academic prose, which may help explain the Japanese learners’ linguistic behavior, although the learners appear to deviate from AmE for both meanings to a greater extent than the textbook. One pedagogical implication here may be that texts used in teaching materials would ideally reflect authentic usage to a higher degree.

Regarding “ability”, it is unsurprising but worth pointing out that the “ability” uses in Zemach and Islam (2005) do not contain any tokens of the unacceptable “past performance on one occasion” use (see (11)–(12) and associated discussion); we explain, above, that instances of this unacceptable use may be due to L1 influence and how this can be mitigated in teaching.

In addition to the actual language used by Zemach and Islam (2005), we may consider some grammatical topics they single out for special attention. For example, the authors offer quite an elaborate discussion of how to describe cause and effect (see their Units 7 and 11, which cover expressions such as because, so, therefore, and consequently). By contrast, there is no discussion on hedging one’s claims, which would have provided an appropriate context in which to explain the role of modal verbs and (some of) their (main) uses, especially in academic discourse.

Zemach and Islam’s (2005, 2011) disregard of modals and their use in academic prose is far from unique. Some examples of other textbooks from major publishers (all aimed at similar CEFR levels around A2 or B1) that display a similar gap are Blanchard and Root (2017), Folse et al. (2020), and Singleton (2022). However, one exception in this respect is Frodesen and Wald (2016). Each chapter in this book starts with a “raising language awareness” section, in which the authors draw attention to some lexicogrammatical structure and its importance in (academic) writing. Thus, Frodesen and Wald’s chapter that includes modals couches the discussion and exercises in a broader discussion of hedges (2016: 175–180). The authors observe that “[i]n academic writing, writers often need to qualify statements with vocabulary that expresses degrees of certainty or accuracy about the information they are conveying” (Frodesen and Wald 2016: 175). They single out modal verbs for special attention by discussing their use in academic prose in some detail (Frodesen and Wald 2016: 178–179) but their exercises also include epistemic stance adverbials such as typically and the copular verb seem (ibid.: 177), which commonly marks likelihood in academic writing (Biber et al. 2021: 438).

We would highlight Frodesen and Wald (2016: 175–180) as an instance of good pedagogical practice and would recommend that other textbook authors follow their example in relation to English modal verbs. Furthermore, our suggestion for EAP instructors is that they may wish to refer to Frodesen and Wald or prepare similar materials themselves.

Related to the lack of discussion of hedging, although Zemach and Islam (2005) clearly set out to develop students’ academic writing,^[14] the texts that students are asked to engage with and to produce include topics such as “gifts”, “a trend”, and so on. As these topics are not academic the texts are typically narrative – and it is well known that narrative texts differ in many respects, including the use of modal verbs, from academic prose (cf. Biber 1988).

Japanese students’ use of could, meaning “past ability”, in 11 different topics shows that the most frequent use is seen in their writing about “interesting or unusual experiences” (41 tokens, 31.1 %), followed by writing about “a difficult decision” (27, 20.5 %), and then in writing about “explanations and excuses” (19, 14.4 %). As for the topic with the highest use of could, the high frequency may be largely due to the purpose of writing a narrative story and the fact that students were likely to write about their personal experiences in the past. Even after excluding student writing on these three topics, the Japanese learners’ use of this modal is still significantly more frequent than that observed in the AmE data. Topics, then, may have an influence on Japanese students’ use of could. We would encourage EAP materials developers and instructors to focus more on academic topics and texts (e.g., on the relation between nutrition and health, or on climate change; cf. Frodesen and Wald 2016: 176, 179).

In explaining why EFL learners’ academic English is relatively informal, Gilquin and Paquot (2008) have made a useful start in unpacking teaching-induced factors into different types, specifically, (i) lists of expressions that appear to suggest full synonymy when there are in fact subtle semantic and/or register differences and (ii) translations of examples from the L1 into the L2 that are not sufficiently sensitive to register appropriacy in the L2 (2008: 54–55). To this we add three more types: (iii) significant lack of faithfulness to the register of academic discourse with respect to the frequency of a particular expression or usage in texts and exercises,^[15] (iv) lack of explicit discussion of an area of grammar that is important to academic prose and/or known to be difficult, and (v) lack of representation of academic texts.

The fourth and final factor that Gilquin and Paquot (2008) suggest may contribute to the relatively informal academic English produced by learners is development, by which they mean development of the mastery of academic prose in learners’ L1. Based on data from the Louvain Corpus of Native English Essays (LOCNESS)^[16] they show that native speaker students often approximate expert academic writing in their L1 more closely than non-native students in their L2, but that there is nevertheless a noticeable tendency toward “spoken-like features” even in the L1 (2008: 56).

Studies on the acquisition of Japanese academic prose by Japanese university students (e.g., Akiyama 2021; Ishiguro 2011; Yamaji et al. 2013) suggest that there is a similar tendency toward speech-like features. For example, Ishiguro (2011: 15) notes that this tendency among students seems to be accelerating with the spread of media such as text messages, blogs, and Twitter, in which spoken language is used. Based on these studies and our own, we suggest that when teaching Japanese university students, more emphasis should be placed on appropriate use of formal grammatical features, including modal verbs, in their academic prose both in Japanese and in English.

6 Conclusions

The analysis of our JUEC data suggests that the writing of the Japanese undergraduate students we targeted could be characterized as colloquial and not sufficiently academic, especially judging from the learners’ use of could. For may and might with the “epistemic possibility” meaning, we do not in fact find any significant difference between our learner data (JUEC) and the AmE reference corpus (AmE06_L). The Japanese students used could much more frequently than the AmE writers. However, they used it less frequently as a marker of “epistemic possibility”, which is an important use of this modal verb in academic writing. They used could very frequently with the meaning of “past ability”. This modal turns out to be especially problematic for Japanese learners.

To explain our findings, we turn to a set of four factors usefully summarized in Gilquin and Paquot’s (2008) study on the typically relatively colloquial nature of EFL learners’ academic prose. We conclude that the topics Japanese students are engaged with may cause (at least in part) their frequent use of could. L1 interference also seems to play an important role. Japanese dekita and could differ in the scope of their meaning, with dekita covering both “past ability”, which could can also signal, and the “past performance on one occasion” meaning, which in American (and British) English it cannot.^[17]

We have proposed a range of pedagogical implications. EAP materials writers and/or instructors should focus on differences between modals in the L1 and in English; explicitly cover modals as hedges and discuss hedging in academic writing; ensure that usage of modal verbs in teaching materials is analogous to authentic texts; select texts and topics that are academic in nature; and practice formal writing also in the students’ L1.

Corresponding author: Willem B. Hollmann, Lancaster University, Lancaster, UK, E-mail: w.hollmann@lancaster.ac.uk

Funding source: Japan Society for the Promotion of Science KAKENHI

Award Identifier / Grant number: JP19K00806

Award Identifier / Grant number: JP23K00757

Research funding: This work was supported by Japan Society for the Promotion of Science KAKENHI (JP19K00806, JP23K00757).

Appendix A: Topics in Zemach and Islam (2005, 2011) and number of texts in JUEC

	Topics	Number of texts	Number of words
Unit 2	Gifts	71	7,601
Unit 3	Places	75	8,589
Unit 4	People	72	7,816
Unit 5	A trend	71	8,654
Unit 6	Your opinions	71	8,832
Unit 7	Explanations and excuses	64	7,095
Unit 8	Problems or difficulties	65	7,863
Unit 9	Interesting or unusual experiences	68	8,524
Unit 10	Life changes	60	7,498
Unit 11	A difficult decision	66	8,231
Unit 12	The future	58	8,360
Total		741	89,063

Note. The number of texts in each unit differs because some students did not submit their assignments.

Appendix B: Semantic analysis of may, might, and could in JUEC and AmE06_L

Number of words: JUEC 89,063; AmE06_L 185,506

Modal	Meaning	JUEC raw freq	JUEC per 10,000 words	AmE06_L raw freq	AmE06_L per 10,000 words	LL	%DIFF
May	Epistemic possibility	45	5.05	116	6.25	1.52	−19.20
May	Other	3	0.34	26	1.40	7.85**	−75.97
Might	Epistemic possibility	26	2.92	41	2.21	1.20	32.08
Might	Other	0	0.00	8	0.43	6.27*	−100.00
Could	Epistemic possibility	3	0.34	20	1.08	4.63*	−68.76
Could	Ability	143	16.06	21	1.13	212.95***	1318.33
Could	Other	2	0.22	6	0.32	0.21	−30.57
Total		222	24.93	238	12.83	49.39***	94.28

Note. LL = log likelihood values. Negative values of %DIFF indicate that the frequency is lower in JUEC than in AmE06_L. Differences significant at the p < 0.05 level are marked with one asterisk, those at p < 0.01 with two asterisks, and those at p < 0.001 and p < 0.0001 with three.
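
The figures in Appendix B can be rechecked from the raw counts. The sketch below is illustrative only and is not the authors' own procedure: it assumes that LL is the two-corpus log-likelihood statistic described by Rayson and Garside (2000) and that %DIFF is the relative difference between the normalized frequencies, taking AmE06_L as the baseline. The function names are hypothetical. Under these assumptions, the values computed for the rows shown agree with the table to within rounding.

```python
import math

JUEC_WORDS = 89_063       # corpus size reported in Appendix B
AME06_L_WORDS = 185_506   # corpus size reported in Appendix B


def per_10k(freq, corpus_size):
    """Normalized frequency per 10,000 words."""
    return freq / corpus_size * 10_000


def log_likelihood(freq1, size1, freq2, size2):
    """Two-corpus log-likelihood (assumed formula, after Rayson and Garside 2000)."""
    total = freq1 + freq2
    expected1 = size1 * total / (size1 + size2)
    expected2 = size2 * total / (size1 + size2)
    ll = 0.0
    for observed, expected in ((freq1, expected1), (freq2, expected2)):
        if observed > 0:  # a zero count contributes nothing to the sum
            ll += observed * math.log(observed / expected)
    return 2 * ll


def pct_diff(freq1, size1, freq2, size2):
    """Relative difference of normalized frequencies, in percent.

    Assumes corpus 2 is the baseline; undefined when freq2 is zero.
    """
    nf1 = per_10k(freq1, size1)
    nf2 = per_10k(freq2, size2)
    return (nf1 - nf2) / nf2 * 100


# Raw frequencies taken from Appendix B: (JUEC, AmE06_L)
rows = {
    "may, epistemic possibility": (45, 116),
    "could, ability": (143, 21),
}
for label, (juec, ame) in rows.items():
    ll = log_likelihood(juec, JUEC_WORDS, ame, AME06_L_WORDS)
    diff = pct_diff(juec, JUEC_WORDS, ame, AME06_L_WORDS)
    print(f"{label}: LL = {ll:.2f}, %DIFF = {diff:.2f}")
```

A positive %DIFF here means that the item is relatively more frequent in JUEC, which matches the sign convention stated in the note above.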

References

Aijmer, Karin. 2002. Modality in advanced Swedish learners’ written interlanguage. In Sylviane Granger, Joseph Hung & Stephanie Petch-Tyson (eds.), Computer learner corpora, second language acquisition and foreign language teaching, 55–76. Amsterdam: John Benjamins. https://doi.org/10.1075/lllt.6.07aij.

Akiyama, Eiji. 2021. Hanashi Kotoba Kaki Kotoba ni Taisuru Daigakusei Kōkōsei no Ninshiki [University and high school students’ understanding of spoken and written language]. The Bulletin of the Faculty of Law and Letters. Humanities 50. 33–60.

Altenberg, Bengt. 1997. Exploring the Swedish component of the International Corpus of Learner English. PALC 97. 119–132.

Anthony, Laurence. 2020. AntConc (Version 3.5.9) [Computer software]. Tokyo, Japan: Waseda University. Available at: http://www.antlab.sci.waseda.ac.jp/.

Aull, Laura. 2015. First-year university writing: A corpus-based study with implications for pedagogy. New York: Palgrave Macmillan. https://doi.org/10.1057/9781137350466.

Aull, Laura L. 2019. Generality and certainty in undergraduate writing over time: A corpus study of epistemic stance across levels, disciplines, and genders. In Anne Ruggles Gere (ed.), Developing writers in higher education: A longitudinal study, 139–162. Ann Arbor: University of Michigan Press.

Aull, Laura L. & Zak Lancaster. 2014. Linguistic markers of stance in early and advanced academic writing: A corpus-based comparison. Written Communication 31(2). 151–183. https://doi.org/10.1177/0741088314527055.

Aull, Laura L., Dineth Bandarage & Meredith Richardson Miller. 2017. Generality in student and expert epistemic stance: A corpus analysis of first-year, upper-level, and published academic writing. Journal of English for Academic Purposes 26. 29–41. https://doi.org/10.1016/j.jeap.2017.01.005.

Becker, Kimberly & Hui-Hsien Feng. 2020. Stance in unpublished student writing: An exploratory study of modal verbs in MICUSP’s physical science papers. In Ute Römer, Viviana Cortes & Eric Friginal (eds.), Advances in corpus-based research on academic writing, 255–278. Amsterdam: John Benjamins Publishing Company. https://doi.org/10.1075/scl.95.11bec.

Biber, Douglas. 1988. Variation across speech and writing. Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9780511621024.

Biber, Douglas, Stig Johansson, Geoffrey N. Leech, Susan Conrad & Edward Finegan. 2021. Grammar of spoken and written English. Amsterdam: John Benjamins Publishing Company. https://doi.org/10.1075/z.232.

Biber, Douglas & Randi Reppen. 2002. What does frequency have to do with grammar teaching? Studies in Second Language Acquisition 24(2). 199–208. https://doi.org/10.1017/s0272263102002048.

Blanchard, Karen & Christine Root. 2017. Ready to write 2: Perfecting paragraphs, 5th edn. New Jersey: Pearson Education.

Breeze, Ruth. 2008. Researching simplicity and sophistication in student writing. International Journal of English Studies 8. 51–66.

Cheng, Lauretta S. P., Danielle Burgess, Natasha Vernooij, Cecilia Solís-Barroso, Ashley McDermott & Savithry Namboodiripad. 2021. The problematic concept of native speaker in psycholinguistics: Replacing vague and harmful terminology with inclusive and accurate measures. Policy and Practice Reviews 12. 1–22. https://doi.org/10.3389/fpsyg.2021.715843.

Davies, Mark. 2009. The 385+ million word Corpus of Contemporary American English (1990–2008+): Design, architecture, and linguistic insights. International Journal of Corpus Linguistics 14(2). 159–190. https://doi.org/10.1075/ijcl.14.2.02dav.

Declerck, Renaat. 1991. A comprehensive descriptive grammar of English. Tokyo: Kaitakusha.

Depraetere, Ilse & Chad Langford. 2020. Advanced English grammar: A linguistic approach, 2nd edn. London: Bloomsbury Academic.

Downing, Angela. 2015. English grammar: A university course, 3rd edn. Oxford: Routledge.

Folse, Keith S., April Muchmore-Vokoun & Elena Vestri. 2020. Great writing 2: Great paragraphs, 5th edn. Boston: National Geographic Learning.

Friginal, Eric. 2018. Corpus linguistics for English teachers: Tools, online resources, and classroom activities. New York: Routledge. https://doi.org/10.4324/9781315649054.

Frodesen, Jan & Margi Wald. 2016. Exploring options in academic writing: Effective vocabulary and grammar use. Ann Arbor: University of Michigan Press. https://doi.org/10.3998/mpub.1150364.

Gilquin, Gaëtanelle. 2022. One norm to rule them all? Corpus-derived norms in learner corpus research and foreign language teaching. Language Teaching 55(1). 87–99. https://doi.org/10.1017/s0261444821000094.

Gilquin, Gaëtanelle & Sylviane Granger. 2015. Learner language. In Douglas Biber & Randi Reppen (eds.), The Cambridge handbook of English corpus linguistics, 418–435. Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9781139764377.024.

Gilquin, Gaëtanelle & Magali Paquot. 2008. Too chatty: Learner academic writing and register variation. English Text Construction 1(1). 41–61. https://doi.org/10.1075/etc.1.1.05gil.

Granger, Sylviane. 1996. From CA to CIA and back: An integrated approach to computerized bilingual and learner corpora. In Karin Aijmer, Bengt Altenberg & Mats Johansson (eds.), Languages in contrast. Text-based cross-linguistic studies, 37–51. Lund: Lund University Press.

Granger, Sylviane. 1998. The computer learner corpus: A versatile new source of data for SLA research. In Sylviane Granger (ed.), Learner English on computer, 3–18. Harlow: Addison Wesley Longman. https://doi.org/10.4324/9781315841342-1.

Granger, Sylviane. 2015. Contrastive interlanguage analysis: A reappraisal. International Journal of Learner Corpus Research 1(1). 7–24. https://doi.org/10.1075/ijlcr.1.1.01gra.

Granger, Sylviane & Paul Rayson. 1998. Automatic profiling of learner texts. In Sylviane Granger (ed.), Learner English on computer, 119–131. Harlow: Addison Wesley Longman. https://doi.org/10.4324/9781315841342-9.

Hinkel, Eli. 2020. Teaching academic L2 writing: Practical techniques in vocabulary and grammar, 2nd edn. New York: Routledge. https://doi.org/10.4324/9780429437946.

Holmes, Janet. 1988. Doubt and certainty in ESL textbooks. Applied Linguistics 9(1). 21–44. https://doi.org/10.1093/applin/9.1.21.

Hunston, Susan. 2002. Corpora in applied linguistics. Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9781139524773.

Hyland, Ken. 2016. Writing with attitude: Conveying a stance in academic texts. In Eli Hinkel (ed.), Teaching English grammar to speakers of other languages, 246–265. New York: Routledge.

Ishiguro, Kei. 2011. Hanashi Kotoba to Kaki Kotoba – Sho Nenji Kyōiku no Kiso Shiryō to Shite [How to improve unnatural spoken phrases in Japanese compositions written by L1 Japanese undergraduate students]. Gengo Bunka 48. 15–35.

Ishikawa, Shin’ichiro. 2015. Gakushūsha kōpasu Ⅱ: Kokunai ni okeru Eigo gakushūsha kōpasu no kaihatsu to kenkyū [Learner corpus II: Development of English learners’ corpora and studies in Japan]. In Yukio Tono (ed.), Kōpasu to Eigo kyōiku [Corpus and English education], 99–129. Tokyo: Hituzi Syobo.

Japan Exchange and Teaching Programme [JET]. 2022. Number of participants by country. https://jetprogramme.org/wp-content/MAIN-PAGE/intro/participating/2022_jetstats_e.pdf (accessed 16 April 2023).

Larsen-Freeman, Diane & Marianne Celce-Murcia. 2016. The grammar book. Form, meaning and use for English language teachers, 3rd edn. Boston: National Geographic Learning.

Leech, Geoffrey, Marianne Hundt, Christian Mair & Nicholas Smith. 2009. Change in contemporary English: A grammatical study. Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9780511642210.

Leedham, Maria Elizabeth. 2011. A corpus-driven study of features of Chinese students’ undergraduate writing in UK universities. Milton Keynes: Open University PhD thesis.

Lowe, Robert J. & Richard Pinner. 2016. Finding the connections between native-speakerism and authenticity. Applied Linguistics Review 7(1). 27–52. https://doi.org/10.1515/applirev-2016-0002.

McCrostie, James. 2008. Writer visibility in EFL academic writing: A corpus-based study. ICAME Journal 32. 97–114.

Ministry of Education, Culture, Sports, Science and Technology [MEXT]. 2015. Eigo no Shikaku-kentei Shiken no Katsuyō Sokushin ni Kansuru Kōdō Shishin An [Action guidelines for the promotion of the use of English language qualifications and examinations]. https://www.mext.go.jp/component/b_menu/shingi/giji/__icsFiles/afieldfile/2015/03/25/1356121_02.pdf (accessed 16 April 2023).

Nishitani, Kohei & Takashi Nakazaki. 2015. Nihongo no Yōhō Kubun ni Kiin Suru Eigo no Goshutsuryoku – “Dekita” to “Could” no Hitaishōsei [English incorrect output caused by usage classifications in Japanese – the asymmetry of “dekita” and “could”]. Persica 42. 45–57.

Nuyts, Jan & Johan van der Auwera. 2016. The Oxford handbook of modality and mood. Oxford: Oxford University Press. https://doi.org/10.1093/oxfordhb/9780199591435.013.4.

Özhan, Diden. 2012. A comparative analysis of the use of but, however and although in the university students’ argumentative essays: A corpus-based study on Turkish learners of English and American native speakers. Ankara: Middle East Technical University PhD thesis.

Petch-Tyson, Stephanie. 1998. Writer/reader visibility in EFL written discourse. In Sylviane Granger (ed.), Learner English on computer, 107–118. Harlow: Addison Wesley Longman. https://doi.org/10.4324/9781315841342-8.

Polio, Charlene & Hyung-Jo Yoon. 2021. Exploring multi-word combinations as measures of linguistic accuracy in second language writing. In Bert Le Bruyn & Magali Paquot (eds.), Learner corpus research meets second language acquisition, 96–121. Cambridge: Cambridge University Press. https://doi.org/10.1017/9781108674577.006.

Poole, Robert, Andrew Gnann & Gus Hahn-Powell. 2019. Epistemic stance and the construction of knowledge in science writing: A diachronic corpus study. Journal of English for Academic Purposes 42. 100784. https://doi.org/10.1016/j.jeap.2019.100784.

Potts, Amanda & Paul Baker. 2012. Does semantic tagging identify cultural change in British and American English? International Journal of Corpus Linguistics 17(3). 295–324. https://doi.org/10.1075/ijcl.17.3.01pot.

Rapp, Reinhard, Pierre Zweigenbaum & Serge Sharoff. 2022. Proceedings of the LREC 2022 15th Workshop on Building and Using Comparable Corpora (BUCC 2022). 15th Workshop on Building and Using Comparable Corpora (BUCC 2022), 979-10-95546-94-8, hal-03876674.

Rayson, Paul & Roger Garside. 2000. Comparing corpora using frequency profiling. In Proceedings of the workshop on Comparing Corpora, held in conjunction with the 38th Annual Meeting of the Association for Computational Linguistics, Hong Kong. https://doi.org/10.3115/1117729.1117730.

Römer, Ute & Matthew Brook O’Donnell. 2011. From student hard drive to web corpus (part 1): The design, compilation and genre classification of the Michigan Corpus of Upper-level Student Papers (MICUSP). Corpora 6(2). 159–177. https://doi.org/10.3366/cor.2011.0011.

Schmidt, Richard W. 1990. The role of consciousness in second language learning. Applied Linguistics 11(2). 129–158. https://doi.org/10.1093/applin/11.2.129.

Schmidt, Richard. 2012. Attention, awareness, and individual differences in language learning. In Wai Meng Chan, Kwee Nyet Chin, Sunil Kumar Bhatt & Izumi Walker (eds.), Perspectives on individual characteristics and foreign language education, 27–50. Berlin: Walter de Gruyter. https://doi.org/10.1515/9781614510932.27.

Singleton, Jill. 2022. Writers at work: The paragraph. Cambridge: Cambridge University Press & Assessment.

Tåqvist, Marie Kristin. 2018. “A wise decision”: Pre-modification of discourse-organising nouns in L2 writing. Journal of Second Language Writing 41. 14–26. https://doi.org/10.1016/j.jslw.2018.05.003.

Tono, Yukio. 2020. CEFR-J wa Nihon no Eigo Kyōiku ni Oite Dono Yōni Katsuyō Sarete Iruka [How is the CEFR-J used in English education in Japan?]. In Yukio Tono & Masashi Negishi (eds.), Kyōzai tesuto sakusei no tame no CEFR-J risōsu bukku [CEFR-J resource book for teaching materials and test design], 24–27. Tokyo: Taishukan.

Virtanen, Tuija. 1998. Direct questions in argumentative student writing. In Sylviane Granger (ed.), Learner English on computer, 94–106. Harlow: Addison Wesley Longman. https://doi.org/10.4324/9781315841342-7.

Wulff, Stefanie & Stefan Th Gries. 2021. Exploring individual variation in learner corpus research: Methodological suggestions. In Bert Le Bruyn & Magali Paquot (eds.), Learner corpus research meets second language acquisition, 191–213. Cambridge: Cambridge University Press. https://doi.org/10.1017/9781108674577.010.

Yamaji, Naoko, Kyoko Chinami & Hiroyuki Fujiki. 2013. Nihonjin Daigakusei no Kaki Kotoba Shūtoku – Sho Nenji to San Nenji ni Okeru Chōsa Kekka no Hikaku Kara – [Japanese undergraduates’ acquisition of academic written language: From a comparison of performances in a quiz as freshmen and as juniors]. Senmon Nihongo Kyōiku Kenkyu 15. 47–52.

Zemach, Dorothy E. & Carlos Islam. 2005. Paragraph writing. Oxford: Macmillan Education.

Zemach, Dorothy E. & Carlos Islam. 2011. Writing paragraphs. Oxford: Macmillan Education.

Received: 2023-07-22

Accepted: 2023-10-29

Published Online: 2024-05-08

Published in Print: 2024-05-27

This work is licensed under the Creative Commons Attribution 4.0 International License.
