New Approach to Teaching Japanese Pronunciation in the Digital Era: Challenges and Practices

Pronunciation has been a black hole in the L2 Japanese classroom on account of a lack of class time, teacher’s confidence, and consciousness of the need to teach pronunciation, among other reasons. The absence of pronunciation instruction is reported to result in fossilized pronunciation errors, communication problems, and learner frustration. With an intention of making a contribution to improve such circumstances, this paper aims at three goals. First, it discusses the importance, necessity, and effectiveness of teaching prosodic aspects of Japanese pronunciation from an early stage in acquisition. Second, it shows that Japanese prosody is challenging because of its typological rareness, regardless of the L1 backgrounds of learners. Third and finally, it introduces a new approach to teaching L2 pronunciation with the goal of developing L2 comprehensibility by focusing on essential prosodic features, which is followed by discussions on key issues concerning how to implement the new approach both inside and outside the classroom in the digital era.

1 Introduction
[...] majority of textbooks do not deal with pronunciation systematically. No lexical accent information is provided for new vocabulary except in China (Abe et al. 2017). Fourth, teachers may not feel confident either about their knowledge of Japanese pronunciation or about their ability to evaluate students' pronunciation and correct their problems. The latter problem is considered particularly true for non-native-speaking teachers, who constitute 70% of the total number of Japanese teachers in the world (Japan Foundation 2015).
The lack of pronunciation instruction could result in learners' difficulties with pronunciation regardless of their L1 backgrounds and proficiency levels, as shown by Toda (2008a; 2009). She conducted a wide-scale survey among 1,216 international students from 47 different countries who enrolled in Japanese pronunciation courses offered by Waseda University (Tokyo, Japan). In their responses, they were able to describe their specific pronunciation errors which they knew how to correct. Their voices also showed how pronunciation problems could hinder not only smooth communication but also the learning of other linguistic aspects (e.g. vocabulary, listening comprehension). Limited pronunciation skills could lower learners' self-confidence and negatively affect how they estimate their own social credibility and abilities (Morley 1998). This is indeed the case for the participants in the survey as well. They received comments from native speakers to the effect that they sounded funny or strange, that what they said was not understandable, and so on. Such experiences caused frustration, ruining their sense of accomplishment.
It seems reasonable to generalize the problems reported in Toda's study as common among JFL learners in Europe. With the intention of helping to improve the situation, this paper has three goals. The first is to show the importance, necessity, and effectiveness of teaching prosodic aspects of Japanese pronunciation from an early stage of acquisition, which is particularly relevant for JFL learners whose L1s are major European languages. The second is to provide a panoramic overview of the major characteristics of Japanese prosody, which is challenging for learners to acquire because of its typological uniqueness. The third is to introduce an emerging approach to teaching Japanese pronunciation in the digital era that is aimed at developing learners' autonomous learning skills.

Prosody First
What is prosody? The word 'prosody' derives from ancient Greek, where it was used for a "song sung with instrumental music" (Nooteboom 1997). Indeed, prosody is sometimes called the "musical" aspect of speech, since it involves variables used to describe music such as pitch contours (melody), rhythm, phrasing, emphasis, timbre (voice quality), silence (pause), and so on (Truax 2000, 39). In modern phonetics the word 'prosody' and its adjectival form 'prosodic' are used to refer to those properties of speech that extend over units larger than individual sounds or segments. Prosodic features are also commonly called suprasegmentals. Recent approaches to teaching Japanese pronunciation put emphasis on prosody, not on segments, as shown by the dominance of prosody exercises in major textbooks of Japanese pronunciation. Table 1 lists the titles of units of Hitori demo manaberu nihongo no hatsuon ('Learn Japanese Pronunciation by Yourselves', Kinoshita, Nakagawa 2019). Units for practicing individual sounds are shown in gray, occupying only one unit in each textbook. Much more emphasis is placed on prosody than on segments for several reasons. The first reason is that Japanese is much more difficult to learn at the prosodic level than at the segmental level. Japanese is one of the languages whose phonemic inventory is relatively small. This means that Japanese phonemes are likely to be available in learners' L1s, which makes segmental errors less likely. 3 This is also the case for five representative European languages, i.e. English, German, French, Italian and Spanish, whose phonemic inventories are larger than that of Japanese, as shown in Table 2.
The second reason is the great importance of prosody in speech communication. Prosody conveys not only a broad range of linguistic information, such as lexical information, syntactic information, the chunking of the speech stream into phrases, the signaling of new and contrastive information, and the disambiguation of sentences, but also rich paralinguistic information, i.e. information related to the identity, age, gender, and emotional state of the speaker (Lengeris 2012). Prosodic information present in fluent speech helps the listener perceive the utterances. Words, like musical notes, are grouped together into phrases by their rhythmic and durational properties as well as their tonal pitch. This organization of prosodic phrasing (grouping of words within an utterance) affects the understanding of sentences (Frazier, Carlson, Clifton 2006). The comprehension of spoken language is a complex skill that involves mapping the acoustic signals of the speaker's output onto linguistic units, such as phonemes, syllables, and words (Ayasse, Alexis, Wingfield 2018).
3 As for learners whose L1s are Romance languages, the exception is /h/. In Romance languages, the alphabetic letter h is not pronounced (e.g. homme /om/ 'man' in French; ha /a/ 'have (third person singular)' in Italian; hola /ola/ 'hello' in Spanish). This characteristic tends to result in errors in which /h/ is not pronounced when speakers of those languages learn a language with phonemic /h/: e.g. English hair and Japanese hai 'yes' pronounced as air and ai, respectively. However, note that the availability of the phonemes of a target language does not guarantee segmental acquisition, although it tends to facilitate it, as shown by the /h/ lenition observed among Indonesian learners of Japanese even though /h/ is phonemic in Indonesian (Hatasa, Takahashi, Ito 2016).
At the sentence level, prosodic cues help the listener to detect the lexical meanings of the expressions uttered, determine the syntactic structure of the utterance, and comprehend utterance meanings. The importance of prosodic functions also holds in L2 speech communication. In the perception of L2 speech by L1 speakers, prosody plays a more significant role than individual sounds, as reported in earlier research. Studies comparing the relative contribution of segmental vs. prosodic features to the degree of foreign accent have shown that deviations in prosodic features may affect listeners' judgement more than deviations in segmental features (Lengeris 2012). Specifically, prosody has been found to be linked to the intelligibility, comprehensibility, and accentedness of L2 speech (for a review, among others, see Lengeris 2012). This is also the case for L2 Japanese. A stronger effect of prosody has been found not only on native Japanese speakers' evaluation of L2 Japanese pronunciation (Sato 1995) but also on the comprehensibility and naturalness of L2 Japanese pronunciation as perceived by L1 Japanese speakers (Kato et al. 2012; Saito, Akiyama 2017).
Table 1 Titles of units of Hitori demo manaberu nihongo no hatsuon ('Learn Japanese Pronunciation by Yourselves', Kinoshita, Nakagawa 2019) 4
1) Introduction - prepare self-learning - how to use OJAD and Praat
2) Slash reading 1 - comprehensible and intelligible intonation
3) Slash reading 2 - how to express emotions
4) Noun and adjective accent - pitch control
5) Verb accent - mountain-shaped, plateau-shaped accent and intonation
6) Sentence-final intonation 1 - ka, ne, yo
7) Sentence-final intonation - janai, yone, kana, kane
8) Rhythm 1 - long vowel - geminate - coda nasal N
9) Rhythm 2 - rhythmic patterns of words and - senryū
10) Vowels and consonants - how to pronounce
11) Sound changes - gender and dialect variations
12) How to express feelings 1 - politeness
13) How to express feelings 2 - roles and characters
14) Conclusion - toward future pronunciation learning

The third reason is its overall complex nature, as shown earlier.
Prosody consists of multiple components; two essentials are tonal and rhythmic components. In modern linguistic theories, both components are considered hierarchical with different levels of organization (for reviews, see Jun 2005 for intonation; Arvaniti 2009 for rhythm). Every language has prosodic grouping and prosodic prominence at multiple levels (word, phrase, utterance), but different languages use them in very different ways (Jun 2005). Languages differ in their inventory of prosodic units, way of grouping prosodic units (i.e. prosodic grouping), and way of expressing prosodic prominence. To learn and teach Japanese pronunciation, it must be worthwhile and helpful to be familiar with essential characteristics of Japanese prosody.
Lastly, more attention needs to be paid to prosodic aspects since Japanese prosody is particularly challenging to learn regardless of learners' L1 backgrounds, as is well known from both research and practice. Earlier studies have shown that the common errors that occur regardless of learners' L1 backgrounds are related to prosody, as opposed to segmental errors, which tend to be found only among learners with specific L1s 5 (Kondo 2011).

Typologically Unique Characteristics of Japanese Prosody
Why is it so challenging to learn Japanese prosody? It is mainly due to its typologically unique characteristics (Hayashi 2018). Two key elements of Japanese prosody at the lexical level are lexical pitch accent and the mora. Both have lexical functions: i.e. they are used to distinguish meanings of words. Japanese intonation is generated on the basis of the distribution of lexical pitch accent, while rhythm is organized with the use of the mora as a basic unit. The mora is subsyllabic, the smallest of the prosodic units of languages. For this reason, Japanese is classified as a pitch-accent language from a pitch or tonal point of view and as a mora-timed language from a rhythmic point of view. Japanese is one of the very rare languages classified as a pitch-accent, mora-timed language. This typological uniqueness tends to result in difficulties in learning Japanese prosody for learners whose L1s have very different prosodic characteristics.
5 For example, the aforementioned problem of silent /h/ tends to occur only in the production of L2 Japanese by learners whose L1s do not have /h/ (e.g. French, Italian, and Spanish), while this error tends not to occur among learners whose L1s have /h/. Similarly, Korean learners of Japanese have difficulty in pronouncing ザ、ズ、ゼ、ゾ /za, zu, ze, zo/ since /z/ is not a part of the Korean phonemic inventory.
The specific elements of L2 Japanese speech that are difficult for a large number of learners to acquire are the properties of Japanese speech that are typologically unique: phonemic length contrasts for both consonants and vowels, and pitch accent. Both are listed among the most common problems by Toda (2009) and Kondo (2011), based on the results of their wide-scale surveys. 6 To further understand the nature of learning difficulties, Japanese phonemic length contrasts and pitch accent are explained mainly from a typological point of view in the rest of this section, considering essential phonological and phonetic characteristics within the rhythmic and tonal components of Japanese speech. Issues related to the L2 acquisition of those properties are also mentioned.
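As an illustration of how phonemic length contrasts relate to the mora, the following minimal Python sketch counts morae in a word written in kana. It is not taken from the studies cited here. It assumes that the input contains only hiragana or katakana, that every kana character counts as one mora (including the geminate marker っ, the moraic nasal ん, and the long-vowel mark ー), and that small glide and vowel kana such as ゃ, ゅ, ょ merge with the preceding kana. The function name count_morae is hypothetical.

```python
import unicodedata

# Small kana that combine with the preceding kana into a single mora
# (assumption: the small tsu is NOT listed here, since a geminate is its own mora).
SMALL_KANA = set("ゃゅょぁぃぅぇぉゎャュョァィゥェォヮ")


def count_morae(kana: str) -> int:
    """Count morae in a word written only in hiragana or katakana."""
    # Normalize so that voiced kana such as ば are single code points.
    text = unicodedata.normalize("NFC", kana)
    return sum(1 for ch in text if ch not in SMALL_KANA)


# Vowel length contrast: obasan 'aunt' vs. obaasan 'grandmother'
short_vowel = count_morae("おばさん")
long_vowel = count_morae("おばあさん")

# Consonant length contrast: kata 'shoulder' vs. katta 'won'
single_consonant = count_morae("かた")
geminate = count_morae("かった")
```

In each pair the longer member has exactly one more mora than the shorter one, which is the durational difference that learners need to perceive and produce.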

Mora-timed Temporal Organization
In terms of rhythm or temporal organization, languages are typologically classified into three types: stress-timed, syllable-timed, and mora-timed (for reviews, among many others, see Dauer 1983; Arvaniti 2009). This classification was originally proposed on the basis of the isochrony hypothesis, which makes two claims: 1) every language belongs to one particular rhythm type; and 2) rhythm types are defined in terms of a timing unit (syllable, foot, mora) that is of equal duration. A number of studies were conducted to test the hypothesis, but experimental results were too inconsistent to support it (Dauer 1983; Arvaniti 2009). Later studies have proposed that different timing types are characterized rather by the combination of different properties of speech, such as syllable structure, the distribution of word stress/accent, the phonetic realization of lexically prominent syllables (e.g. stressed and pitch-accented syllables in English and Japanese, respectively), the reduction of unstressed/unaccented syllables, and so on (Arvaniti 2009). In stress-timed languages (e.g. English and German), more complex syllable structures are found in stressed syllables, and syllables with more complex structures tend to be stressed (Dauer 1983) and are longer in duration and greater in intensity. Syllable structure and stress are more likely to reinforce each other in a stress-timed than in a syllable-timed language. The inventory of syllable types is more limited in syllable-timed languages like French, Italian, and Spanish, which have less vowel reduction, and even more limited in mora-timed languages like Japanese, which has neither accentual lengthening nor vowel reduction (e.g. Beckman 1986). The proportion of CV syllables (one consonant followed by a vowel) is remarkably higher in Japanese than in English and Spanish, which show a smaller proportion of CV because their syllables are distributed more widely among different syllable types.
These differences in multiple properties of speech lead to the greatest durational contrasts between stressed and unstressed syllables in stress-timed languages, the smallest durational contrasts between accented and unaccented syllables in Japanese (a mora-timed language that does not have a stress accent system), and somewhere in the middle in syllable-timed languages.
The majority of languages are stress-timed or syllable-timed, and only a few modern languages are classified as mora-timed. Japanese timing patterns are phonetically characterized by a small amount of durational malleability at the prosodic level (Ueyama 2012), mainly due to the absence of accented-syllable lengthening and the absence of vowel reduction in unstressed syllables. 7 These characteristics are difficult to acquire, especially for learners whose L1s are stress-timed languages with a greater amount of durational malleability that is manifested in stressed-syllable lengthening and unstressed-syllable reduction.

Ueyama Motoko
There has been continuous research on the L2 acquisition of phonemic length contrasts that is still ongoing. A set of general findings emerges from the results of earlier studies (see Hirata 2015 for a comprehensive review). At an early stage of acquisition, L1 backgrounds of learners affect perception significantly. Learners that are not familiar with phonemic length contrasts in their L1s do not perceive Japanese length contrasts with clear and stable categorical boundaries like native Japanese speakers do. Their perception varies depending on phonetic contexts such as lexical pitch accent, speech rate, and positions in a word (e.g. word-initial contrasts are the easiest to perceive), while L1 Japanese perception is stable. In contrast, learners whose L1s have phonemic lengths (e.g. Finnish, Arabic) can perceive Japanese length contrasts in a categorical way similar to that of native Japanese speakers. Vowel length tends to be easier to perceive than consonant length, but it is possible to learn to perceive both types of length contrasts eventually. Production is more challenging than perception: learners improve their production as they advance their study of Japanese, but there always seems to be individual variation in the degree of improvement.

Pitch-accent Language
The world's languages have been classified into three categories according to their use of pitch at the lexical level (Jun 2005): tone languages, stress (accent) languages, and pitch-accent languages. Japanese belongs to the third category. The exact percentage for each category is not available. However, the World Atlas of Language Structures (WALS; Haspelmath et al. 2005; Dryer, Haspelmath 2013), a large database of structural (phonological, grammatical, lexical) properties of languages gathered from descriptive materials such as reference grammars, provides information on word prosody for 176 of the 200 sample languages included in the database. Since the database is quite representative, it gives an idea of the distribution of the three types: 141 (80%) use stress, and 28 (16%) have only lexical tone or pitch accent (Goedemans 2010). This information indicates that pitch-accent languages such as Japanese are in the minority. The majority of tone languages are spoken in Asia (e.g. Mandarin Chinese, Thai, and Vietnamese) and in Africa (e.g. Akan, Igbo, and Yoruba). Each tone language has its own inventory of lexical tones (contour tones and level/register tones in Asia and Africa, respectively) applied to a syllable; these tones are phonologically distinctive, showing unique pitch patterns that convey different meanings. In contrast, in both stress (accent) languages and pitch-accent languages, only one syllable or mora in a word is more prominent than the others. A major difference between lexical stress and pitch accent 8 lies in the acoustic correlates involved. In stress languages, word stress involves multiple acoustic parameters, as in English, where word stress is produced with changes in fundamental frequency (F0), intensity, duration, vowel quality, and so on (among others, Beckman 1986). In pitch-accent languages, lexical pitch accent is achieved principally by pitch change.
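The percentages quoted above can be checked directly from the counts given in the text. The short Python sketch below only redoes that arithmetic; the variable and function names are illustrative, and the text does not say how the languages left over after the two reported groups are classified, so they are simply computed as a remainder.

```python
# Counts as reported in the text (Goedemans 2010, based on WALS).
TOTAL_WITH_WORD_PROSODY = 176
STRESS = 141
TONE_OR_PITCH_ACCENT_ONLY = 28


def share(count: int, total: int) -> int:
    """Return the percentage of count in total, rounded to a whole number."""
    return round(100 * count / total)


stress_share = share(STRESS, TOTAL_WITH_WORD_PROSODY)
tonal_share = share(TONE_OR_PITCH_ACCENT_ONLY, TOTAL_WITH_WORD_PROSODY)

# Languages not covered by the two groups reported in the text.
remainder = TOTAL_WITH_WORD_PROSODY - STRESS - TONE_OR_PITCH_ACCENT_ONLY
remainder_share = share(remainder, TOTAL_WITH_WORD_PROSODY)
```

The rounded shares agree with the 80% and 16% given in the text.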
Other than Japanese, major modern languages that have been identified as pitch-accent languages are Serbo-Croatian, Slovenian, Norwegian, Persian, Punjabi, Swedish, Western Basque, and certain dialects of Korean (among others, Jun 2005; Hyman 2006). Some have both pitch accent and stress, such as Serbo-Croatian, Norwegian, and Swedish, but Japanese has only pitch accent. Some have both lexically accented and accentless words, such as Japanese and Northern Bizkaian Basque. Lastly, some languages do not have any lexical tones, stress or accent; French and Seoul Korean belong to this category. In these languages, intonation patterns are determined only based on post-lexical tones (Jun 2005).
As mentioned earlier, a major difference between pitch accent and stress (accent) is that only pitch is involved for the former while multiple correlates exist for the latter. This difference explains one of the common problems among learners whose L1s are stress (accent) languages; they tend to use their L1 stress in place of Japanese pitch accent by lengthening vowels and/or using more intensity at an early stage of acquisition, as shown in past experimental studies (e.g. Ueyama 2012 for L2 Japanese-L1 English; Asano, Gubian 2018 for L2 Japanese-L1 German; Ueyama 2016 for L2 Italian-L1 Japanese). Since Japanese has phonemic vowel length, this non-native extra durational stretch can lead native Japanese speakers to perceive (C)V as (C)VV: e.g. /wakaˈɾɯ/ 'understand' is perceived as /wakaaˈɾɯ/ with the long vowel /aa/. Learners whose L1s do not have a lexical accent face a different problem that is also caused by a difference between their L1s and Japanese. They simply use their L1 tonal shapes at the sentence level.
8 In a current common intonation model, the autosegmental-metrical (AM) approach (among others, see Ladd 1996; Jun 2005), the term pitch accent is also used for post-lexical prominence that is assigned to a word in a certain speech context. Thus, there are two types of pitch accent: lexical pitch accent and post-lexical pitch accent. Post-lexical pitch accents do not have distinctive functions, i.e. they do not change the lexical identity of the word, as opposed to lexical pitch accents, which are used to distinguish meanings, as in Japanese.

Japanese Lexical Accent: Characteristics and L2 Acquisition Issues
The lexical or word accent system of Tokyo Japanese 9 is characterized by the following principal properties. There is only one type of pitch accent, HL, 10 a high tone followed by a falling tone. The presence, absence, and position of pitch accent HL are contrastive (e.g. McCawley 1968, and many others). Only one HL pitch fall is allowed within a word, and pitch cannot rise again within the same word once it goes down: i.e. there can maximally be one prominence within a word (Kawahara 2015).
The following steps or rules are applied to derive the final tonal shapes. If there is no lexical accent on the first mora of the word, pitch rises from low to high from the first to the second mora of a phrase-initial word (i.e. phrase-initial pitch rise), and pitch stays high up to the lexical accent. The three tonal patterns of two-mora words with three distinct meanings are presented in [fig. 1]: ha*shi-da 'they are chopsticks'; hashi*-da 'it's a bridge'; hashi-da 'it's an edge' (the underlined syllable is lexically accented while the asterisk marks the approximate location of pitch fall). These three tonal patterns can be abstractly represented by a sequence of high and low tones: H*LL, LH*L, LHH. 11 In this way, the tonal pattern of a one-word unit, bunsetsu, 12 can be explained or predicted by a small set of steps once the presence, absence, and position of pitch accent HL are known.
9 Different Japanese dialects are characterized by varying characteristics of different properties of speech, including word accent.
10 H*+L is alternatively used to represent lexical pitch accent mainly in research works conducted with the AM approach as well as in the J-ToBI prosodic labeling scheme (Venditti 1997; for the extended version of J-ToBI, Maekawa et al. 2002).
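The steps described above can be sketched as a small Python function that derives the binary H/L pattern of a phrase-initial unit from its number of morae and the position of the lexical accent. This is only an illustration of the simplified binary notation used in this section, not an implementation of any published model. The function name and its parameters are hypothetical, and the treatment of a one-mora unaccented unit (a single H) is an assumption that the text does not discuss.

```python
from typing import Optional


def tonal_pattern(n_morae: int, accent: Optional[int] = None) -> str:
    """Return the H/L pattern of a phrase-initial unit (word plus particles).

    n_morae: number of morae in the whole unit.
    accent:  1-based position of the accented mora, or None if accentless.
    The accented mora is marked with '*'.
    """
    if n_morae < 1:
        raise ValueError("n_morae must be at least 1")
    if accent is not None and not 1 <= accent <= n_morae:
        raise ValueError("accent must lie within the unit")

    tones = []
    for position in range(1, n_morae + 1):
        if accent is not None and position > accent:
            tone = "L"  # after the single pitch fall, pitch cannot rise again
        elif position == 1 and accent != 1 and n_morae > 1:
            tone = "L"  # phrase-initial rise: the first mora is low
        else:
            tone = "H"  # pitch stays high up to the lexical accent
        if position == accent:
            tone += "*"
        tones.append(tone)
    return "".join(tones)


chopsticks = tonal_pattern(3, accent=1)  # ha*shi-da
bridge = tonal_pattern(3, accent=2)      # hashi*-da
edge = tonal_pattern(3)                  # hashi-da (accentless)
```

For the three hashi-da examples the function yields H*LL, LH*L, and LHH, matching the patterns listed in the text.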

3.2.3 The L2 Acquisition of Japanese Lexical Pitch Accent
Lexical pitch accent is another element, along with phonemic length contrasts, that causes difficulty for learners of Japanese; L2 acquisition of lexical pitch accent has been studied extensively up to today (for reviews, see Hirata 2015; Hatasa, Takahashi, Ito 2016). The main findings that emerged from past studies are as follows. The degree of perceptual accuracy for different pitch accent patterns and the effects of syllable structures depend on learners' L1s. Learners show higher accuracy in those pitch accent patterns that are similar to the prosodic patterns of their L1s, but accentless words and words with word accent on the last mora (e.g. LHH and LHH*, respectively, for three-mora words with no word-internal pitch fall) are perceived more correctly than words with a pitch fall within a word (e.g. H*LL and LH*L). Perceptual accuracy varies across individuals, regardless of their history of Japanese language study. Even advanced learners cannot come close to a native level of perceptual accuracy, unlike the case of phonemic length contrasts, which advanced learners are able to perceive with native-like accuracy. A similar tendency is found also in [...] for English-speaking learners; Pappalardo 2017 for Italian-speaking learners). Commonly, the same learners produce more than one accent pattern for the same lexical item (Ueyama 2012; Hatasa, Takahashi, Ito 2016), which strongly reflects the nature of interlanguage that is characterized by non-systematic variability (Selinker 1972).
11 Using a binary system of H and L tones has been criticized since it is considered not to represent phonological and phonetic characteristics of Japanese accent patterns, and these days, it is more common to indicate only the position of pitch accent. However, the binary system is employed in this paper for ease of explanation of L2 patterns in this section.
12 Bunsetsu is a morphological unit of accentuation consisting of a content word, such as a noun, verb, or adjective, optionally followed by a string of function morphemes such as particles or postpositions, as defined by Hattori (cited in Nagano-Madsen 2015, 200).
Similar to the acquisition of phonemic length contrasts, production seems to lag behind perception (Hirata 2015; Hatasa, Takahashi, Ito 2016). This general tendency is shown by Sakamoto (2010), who investigated both the production and perception of the same learners. "The experienced learners with an average of 3.7 years of Japanese study with 1 year of stay in Japan showed perception of the three types of pitch accent similar to that of NJs, but their production was still significantly lower than that of native speakers" (cited in Hirata 2015, 737). The patterns that are difficult to learn do not seem to be the same in production and perception, as pointed out by Hatasa and colleagues (2016), based on the results of their comparison of findings of earlier research on the acquisition of Japanese pitch accent by English-speaking learners.
Lexical pitch accents are fundamental components of the phrase-level intonation of Japanese. Japanese intonation can be largely determined at the lexical level based on the distribution of lexical pitch accents, interacting with post-lexical processes such as phrase-initial pitch rise (see also § 3.2.2) and downstep 13 (Pierrehumbert, Beckman 1988; Venditti 1997; Maekawa et al. 2002; Jun 2005; Igarashi 2015). Failure to acquire lexical pitch accents at the production level could result not only in misunderstandings of the meanings of words but also in unnatural intonation at the sentence and discourse level, which may interfere with smooth communication by reducing the comprehensibility of learners' speech.
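The interaction between lexical accent and downstep can be illustrated with a toy Python sketch. It follows the common description in which the pitch range of an accentual phrase is lowered after each lexically accented phrase, producing a staircase of phrase peaks. The starting peak of 200 Hz and the lowering ratio of 0.8 are arbitrary values chosen for illustration only, not measurements, and the function name phrase_peaks is hypothetical.

```python
from typing import List, Sequence


def phrase_peaks(
    accented: Sequence[bool],
    first_peak_hz: float = 200.0,  # illustrative starting peak, not a measured value
    downstep_ratio: float = 0.8,   # illustrative lowering factor, not a measured value
) -> List[float]:
    """Return one peak height per accentual phrase in an utterance.

    accented[i] is True if the i-th accentual phrase is lexically accented.
    Every accented phrase lowers the peaks of all phrases that come after it.
    """
    peaks = []
    scale = 1.0
    for is_accented in accented:
        peaks.append(first_peak_hz * scale)
        if is_accented:
            scale *= downstep_ratio
    return peaks


# A sequence of accented phrases forms a descending staircase of peaks.
staircase = phrase_peaks([True, True, True])

# Accentless phrases do not trigger any lowering in this sketch.
plateau = phrase_peaks([False, False, False])
```

A learner who produces accentless patterns for accented words would, in this simplified picture, also fail to produce the staircase, which is one way in which lexical errors can surface as unnatural sentence-level intonation.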

4 New Approach to Teaching Japanese Pronunciation
The overview of the essential characteristics of Japanese prosody, mainly at the lexical level, has shown two points: 1) common difficulties in learning prosodic features are largely due to their typological uniqueness; and 2) such difficulties cannot be easily overcome simply by accumulating experience of learning Japanese without pronunciation instruction. Under these circumstances, the logical solution is to teach Japanese pronunciation from an early stage of learning. The next question, then, is how. The following points are key to answering the question. The first point is a general goal: i.e. teach comprehensible pronunciation. In the last decade, there have been ongoing shifts in pedagogical implications for pronunciation teaching (Levis 2005, cited in Saito 2018). The traditional approach focuses equally on all L2 pronunciation features to teach native-like accurate pronunciation. The new approach instead focuses selectively on certain features affecting comprehensibility or intelligibility, with the goal of teaching comprehensible L2 pronunciation, based on the fact that many successful L2 speakers remain accented but highly comprehensible (Saito 2018). The goal of the new approach is also the basis of ongoing efforts to improve the problematic situation of teaching Japanese pronunciation in a classroom.
13 Downstep is a phonological process in which the local pitch height of each accentual phrase, typically consisting of one lexical word plus any following particles (Igarashi 2015), i.e. bunsetsu, is reduced when it follows a lexically accented phrase, which results in a staircase-like effect of accentual phrase heights in sequence (see Jun 2005).
Second is the need to teach essential prosodic features in a classroom. This line of instruction was shown to be effective in Oyama's (2014) experimental classroom study. Eight sessions of 20 minutes were carried out over one month, focusing on selected prosodic features such as rhythm and mora length contrast, major characteristics of lexical accent patterns (e.g. pitch accent, accent type, compound accent, simple word accent) and those of intonation. In every session, students also practiced dialogues by focusing on accentual phrases. The comparison of the results of pre- and post-tests showed significant effects of instruction. As pointed out in the conclusion of the study, the important function of this type of training is to provide learners with metalinguistic knowledge about Japanese phonology and phonetics. As reported by Toda (2008b, 2008c), successful learners have learned such knowledge and utilize it to monitor their pronunciation critically.
The third key issue is that teachers need to continue to follow up on instruction in classrooms, especially for pitch accent. Teachers are expected not only to continue to teach the accent patterns of new vocabulary but also to have the ability to assess learners' pronunciation, explain problems, and carry out exercises to solve problems. However, it is evident that not all teachers have such an ability, including native Japanese teachers. Byun (2018) conducted a test of the perception of accent patterns among 126 Japanese students studying Japanese language education, and the results showed that the percentage of correct answers ranged from 60 to 80%. The same study also showed the effects of training that improved the percentage up to 90%. Similar difficulties are also faced by native Japanese teachers, as shown in Kanamura's (2020) survey conducted among 69 teachers working in Japan, which has revealed a psychological block to teaching pronunciation among teachers. Kanamura stresses the importance of supporting Japanese teachers in improving their ability to assess the pitch of their own voices and to understand the difference between model and actual pronunciation.
The fourth point is to utilize digital resources, including teaching materials such as audio files, videos, and websites, ranging from pages designed to teach Japanese pronunciation (e.g. つたえるはつおん Tsutaeru hatsuon) 14 to digital tools that support pronunciation learning online or offline. The most frequently used tool worldwide may be the web-based system OJAD 15 (Online Japanese Accent Dictionary) developed by Minematsu and colleagues. The prosodic reading tutor Suzuki-kun, one of four OJAD features, is utilized not only by learners but also by teachers. Suzuki-kun visualizes intonation curves with pitch accent information for any given text, and also generates speech models in a selected voice at three different speech rates with the use of speech synthesis technologies (Minematsu, Hirano, Nakamura 2018). The combination of a visual display of prosodic information and an audio model has proved to be very effective in improving the naturalness of L2 Japanese pronunciation, as shown by experimental evidence (Minematsu et al. 2016). Speech analysis tools (e.g. Praat, 16 WASP), 17 although not developed originally for teaching second language pronunciation, can be utilized by learners to check their pronunciation by comparing its acoustic patterns with those of audio models. Kinoshita and Nakagawa (2019) propose combining OJAD and Praat. This is currently one of the cutting-edge methods, covering three steps of learners' autonomous learning: visualizing prosodic patterns, generating audio models with OJAD, and checking learners' pronunciation with Praat by comparing its acoustic patterns with those of the audio models.
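The final checking step, comparing a learner's pitch pattern with a model, can be illustrated with a minimal Python sketch. It is not the procedure used by OJAD, Praat, or Kinoshita and Nakagawa (2019). It assumes that one F0 value in Hz per mora has already been measured for both the model and the learner (for instance, read off a pitch display), that both lists have the same length, and that all values are positive. Converting to semitones relative to each speaker's own mean is a common way to compare pitch shapes across voices with different ranges. All function names are hypothetical.

```python
import math
from typing import List, Sequence


def to_relative_semitones(f0_hz: Sequence[float]) -> List[float]:
    """Convert F0 values (Hz) to semitones relative to the contour's own mean."""
    semitones = [12 * math.log2(value) for value in f0_hz]
    mean = sum(semitones) / len(semitones)
    return [s - mean for s in semitones]


def contour_distance(model_hz: Sequence[float], learner_hz: Sequence[float]) -> float:
    """Root-mean-square difference in semitones between two pitch shapes."""
    if not model_hz or len(model_hz) != len(learner_hz):
        raise ValueError("contours must be non-empty and of equal length")
    model = to_relative_semitones(model_hz)
    learner = to_relative_semitones(learner_hz)
    squared = [(m - l) ** 2 for m, l in zip(model, learner)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical per-mora values for a three-mora unit with an LH*L shape.
model = [200.0, 400.0, 200.0]
same_shape_lower_voice = [100.0, 200.0, 100.0]
flat_contour = [150.0, 150.0, 150.0]

matching = contour_distance(model, same_shape_lower_voice)
mismatching = contour_distance(model, flat_contour)
```

In this sketch a learner with a lower voice but the same pitch shape obtains a distance close to zero, while a flat contour yields a clearly larger distance. Any threshold for "close enough" would have to be chosen by the teacher or learner.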
To conclude, this paper has argued for the importance, necessity, and effectiveness of teaching Japanese pronunciation, especially prosodic features, from an early stage of learning, and it has then shown from a typological point of view why Japanese prosody is very difficult to learn for the majority of learners, regardless of their L1 backgrounds. Last but not least, the new approach to teaching L2 pronunciation with the goal of developing L2 comprehensibility has been introduced, along with key issues concerning how to implement the approach both inside and outside the classroom in the digital era.