The impact of a short auditory training on L2 pronunciation in languages with different orthographic depth

It is sometimes assumed that the pronunciation of an L2 is more predictable and thus easier to learn if its orthography is transparent. This study aims to find out whether this assumption holds true in the first stages of L2 learning in languages with different orthographic depth. The study also examines the effect that a short auditory training (supported by simultaneous orthographic input) has on L2 pronunciation. A central finding was that the pronunciation of an L2 with a transparent orthography was not easier for a naïve learner to learn than that of an L2 with an opaque orthography. A second finding was that even a short period of auditory training can bring about a significant improvement in a naïve L2 learner’s speech. Further, the results show that mimicking one specific native speaker could be an effective strategy for pronunciation learning. This method should be studied in more detail in future research.


Introduction
Languages differ with respect to the degree of orthographic transparency, also called orthographic depth [1,2,3]. Orthographic depth refers to the phoneme-to-grapheme (or letter-to-phoneme) correspondence [4,5]. A language with a transparent (shallow) orthography has a relatively consistent correspondence between orthography and pronunciation, whereas in a language with an opaque (deep) orthography the correspondence is relatively inconsistent [4]. For example, in Finnish the phoneme-to-grapheme correspondence is transparent in the sense that graphemes often relate to phonemes in a way which is intuitively close and consistent [2,6]. Thus, the grapheme <o> is pronounced in Finnish as [o], <oo> as [o:], <u> as [u], <uu> as [u:], <ee> as [e:], <i> as [ɪ] etc. almost without exceptions. In a language with an opaque orthography, for example Swedish, the phoneme-to-grapheme correspondence is more inconsistent in that there are often several ways to pronounce one letter (feedforward, i.e., letter-to-phoneme inconsistency) and several ways to spell one sound (feedback, i.e., phoneme-to-letter inconsistency). Problems arise especially when a grapheme represents some other sound than its regular correspondent; e.g., in Swedish, /ʃ/ can be spelled as in <skön>, <stjärna>, <skjuta>, <choklad>, <giraff>, <jalusi>, <shopping> and <garage> (Eng. lovely, star, to shoot, chocolate, giraffe, jalousie, shopping and garage). Thus, a Finnish second language (L2) learner of Swedish must learn many new grapheme-to-phoneme and phoneme-to-grapheme correspondences, and must avoid applying some of those used in his/her first language (L1); this is what learning new orthography-pronunciation correspondences in an L2 involves [7].
When acquiring their L1, children learn pronunciation primarily from auditory input and oral communication [8]. Adult L2 learners, in turn, are often simultaneously exposed to both auditory and orthographic input, and both types of input affect pronunciation learning. Orthographic input can affect both the perception and production of L2 speech [9]. The impact of orthographic input on pronunciation has been demonstrated in several studies on L2 acquisition. In a series of experiments, Bassetti and Atkinson showed that orthographic forms can affect even experienced learners' pronunciation of known words in L2: 85% of Italian learners of English added a phone to the target word (<walk> was pronounced with /l/ etc.) in reading aloud tasks, while 56% of the participants added a phone to the target word in word repetition [10]. Words and syllables that follow the most frequent grapheme-to-phoneme correspondences (e.g., <hit> /hɪt/ in English) are considered regular, and they are usually not problematic for L2 learners. In this case, orthographic input can be a visual support to auditory input and help the learner to both perceive and produce the word correctly [3,6]. For example, Japanese learners of English have been argued to be able to pronounce [l] and [r] correctly even when they do not perceive the distinction, provided they know whether the word is spelled with <l> or <r> [9]. Irregular words do not follow the usual correspondences, which is why it is harder to learn their correct pronunciation.
Sometimes the effect of orthography is so strong that it overrides an L2 learner's ability to hear the actual pronunciation of the word, thus preventing him/her from learning pronunciation accurately [10,11,12]. Further, it seems that exposure to orthographic input alone leads more often to non-target-like pronunciation than exposure to auditory input. This is probably one reason why length of residence (LOR) in an L2 environment seems more beneficial for learning L2 pronunciation than formal instruction, which is often based on or supported by written forms of language [13,14,15]. Adult multilingual learners can use previous knowledge from other languages to discover links between orthography and pronunciation [16]. However, it is difficult to teach pronunciation from spelling alone.
In the present paper, we study the impact of L2 orthography on L2 pronunciation in the very first stages of L2 learning, more specifically when the learner has no prior knowledge of the target language (a naïve learner). Since earlier studies on the effect of orthographic input on L2 pronunciation have focused on L2 learners with different experience of and exposure to the target language, the naïve learner's perspective is a novelty of our study. The learner's L1 is Swedish, and the target languages are Finnish and Portuguese. The target languages were chosen based on their degree of orthographic depth. In Finnish, the orthography is nearly transparent [2,6], while Portuguese has a clearly more opaque orthography [2].
Further, we study what kind of an effect a short auditory training (supported by simultaneous orthographic input) has on L2 pronunciation in languages with different orthographic depth.

Aim and research questions
The first aim of the study is to augment our understanding of the impact of orthography on pronunciation in the first stages of L2 learning. Secondly, we study what kind of an effect a short auditory training has on L2 pronunciation in languages with different orthographic depth. We address the following research questions:
1. Is an L2 with a transparent orthography easier to pronounce for a naïve learner than an L2 with an opaque orthography?
2. What kind of effect does a short auditory training (supported by simultaneous orthographic input) have on comprehensibility and accuracy of L2 speech in languages with different orthographic depth?

Method and Material
A professional speech impersonator (a middle-aged male) with Swedish as his L1 read a short text in Finnish and Portuguese. Both languages were previously unfamiliar to him and had different degrees of orthographic depth. First, the impersonator read aloud a passage (ca. 150 words) from a novel without any training or instructions on how to pronounce the text. Next, he listened to the same text read by a native speaker of the language. He was requested to listen to, mimic and train, and, when he thought he was ready for the task, to read aloud the same texts again. Thus, he also had the text available during the auditory training and the second reading. The training session before the second reading was about two hours in both languages. All recordings were done in the impersonator's own audio studio, sent to the researchers by a secure internet link, stored on external disks, and only available to the researchers. We assume that the impersonator's L1 Swedish affects his pronunciation especially in the first recordings because he has no knowledge of the target languages (although he might recognize them). Thus, he has to rely at least to some degree on his L1. In the second recording he has gained some knowledge of the orthography-pronunciation correspondences and discrepancies in the target languages. These assumptions imply that comprehensibility should be better and the pronunciation more accurate in the second recordings.

Listener test
A listener test with pre- and post-training recordings (hereafter Test 1 and Test 2) was constructed. Native speakers of Finnish and Portuguese rated comprehensibility of the reading on a scale from 1 to 6. The scale was described with the following wordings: 1 = I understand nothing, 2 = I understand a couple of words, 3 = I understand quite little, 4 = I understand quite much, 5 = I understand almost everything, 6 = I understand everything. Further, the listeners graded the accuracy of pronunciation in the recordings by answering (1) whether Test 1 or Test 2 sounded better ('better' in the sense 'closer to the target language'; a forced choice with four alternatives), and (2) what pronunciation features caused the possible difference between the tests (an open question). The listener test was done with an online survey tool. The listeners were able to listen to the speech samples as many times as they wanted and advance in the test at their own pace. All in all, the test took ca. 15 minutes to complete. The listeners had Finnish (n = 28) and Portuguese (n = 30) as their L1. The listener selection was based on two criteria: self-reported L1 and a minimum age of 18 years; i.e., the listeners consisted of a group of adult L1 listeners of the language.
Statistical significance was calculated with (1) the Wilcoxon Signed-Rank Test for the difference between Test 1 and Test 2 within the same language, (2) the Mann-Whitney U test for the difference between the languages in Test 1 and in Test 2, respectively, and (3) a paired samples t-test for the difference between Test 1 and Test 2 in all data.
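As an illustration of the two rank-based tests named above, the following minimal sketch computes only the test statistics (not p-values) in plain Python. The function names and the ratings are hypothetical and chosen for illustration; they are not the study's data or the software used by the authors.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def wilcoxon_signed_rank(before, after):
    """Return (W_plus, W_minus) for paired ratings; zero differences are dropped."""
    diffs = [a - b for b, a in zip(before, after) if a != b]
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus


def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b) for two independent groups of ratings."""
    ranks = average_ranks(list(group_a) + list(group_b))
    n_a, n_b = len(group_a), len(group_b)
    u_a = sum(ranks[:n_a]) - n_a * (n_a + 1) / 2
    return u_a, n_a * n_b - u_a


# Hypothetical ratings on the 1-6 comprehensibility scale (NOT the study's data).
test1 = [3, 4, 3, 2, 4, 3]                # one language, Test 1
test2 = [4, 5, 4, 4, 4, 5]                # same listeners, Test 2
other_language_test1 = [4, 4, 3, 5, 4]    # other listener group, Test 1

w_plus, w_minus = wilcoxon_signed_rank(test1, test2)
u_a, u_b = mann_whitney_u(test1, other_language_test1)
print(w_plus, w_minus, u_a, u_b)
```

The paired test is used within a language because the same listeners rated both recordings, whereas the comparison between languages involves two separate listener groups. In practice a statistics package would be used to obtain p-values from these statistics.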

Results
First, we present comprehensibility as rated by the L1 listeners, and thereafter the accuracy of pronunciation as rated by the L1 listeners.

Comprehensibility as rated by L1 listeners
On average, the listeners understood "quite little" (3) or "quite much" (4) (on a scale from 1 to 6) in both languages in Test 1. The Finnish listeners rated the comprehensibility as somewhat lower than the Portuguese listeners. The mean score was 3.50 in Finnish and 3.93 in Portuguese in Test 1 (a non-significant difference, p=.070, cf. Table 1). The results show that listeners understood something, in many cases quite much, of both languages in Test 1. The finding is in line with human listeners' ability to interpret even fairly distorted (in the sense of not target-language-like) speech signals. In Test 2, the difference between the languages was smaller than in Test 1; the mean score was 4.21 in Finnish and 4.43 in Portuguese (a non-significant difference, p=.376, cf. Table 1). Thus, a short auditory training (supported by simultaneous orthographic input) clearly improved the comprehensibility of the L2 pronunciation in both languages (Table 2). As regards comprehensibility, we can conclude that (1) on average, the listeners understood quite little or quite much of both languages in Test 1, (2) the languages were equally easy or difficult to pronounce for the naïve L2 speaker, (3) both languages got significantly higher comprehensibility ratings in Test 2 than in Test 1, and (4) the effect of a short auditory training (supported by simultaneous orthographic input) was similar and evident in both languages.

The accuracy of pronunciation as rated by L1 listeners
The listeners compared the accuracy of pronunciation between Test 1 and Test 2 in both languages. They were asked to answer a forced-choice question with four alternatives: Test 1 sounded better, the tests were equal (= they sounded equally good or bad), Test 2 sounded better or Test 2 sounded much better. In this context, the word 'better' meant 'closer to the target language', hence the term accuracy. In addition, the listeners were asked (not forced) to comment on what pronunciation features caused the possible difference between the tests. The answers were given a numeric value in the analysis: negative development (= 0); no changes, the tests were equal (= 1); Test 2 was better (= 2); Test 2 was much better (= 3). The results reveal that Test 2 was considered better or much better in both languages. Portuguese underwent the most positive development (cf. Table 3). Listeners' comments on which pronunciation features accounted for the development are presented in Figure 1. Even though Finnish and Portuguese have considerable phonetic differences, the pattern of development in the accuracy of pronunciation was similar in the two languages: prosodic features, especially rhythm (the term was used by the listeners; we do not know exactly what they meant by it), were highly valued by the listeners in both languages (cf. Figure 1). Quite many Finnish listeners also mentioned a better pronunciation of segments in Test 2, while Portuguese listeners mentioned intonation as a feature that was better in Test 2 (cf. Figure 1).
Prosodic features, especially rhythm, have been shown to correlate most with comprehensibility in earlier studies on L2 pronunciation in many different languages [17,18,19,20]. Thus, it is perhaps not surprising that the development mainly concerned prosodic features also in the present study. The listeners might have paid more attention to segmental features if the speaker had been more advanced and his speech easier to understand.
Concerning the accuracy of pronunciation, we can conclude that there was a significant development between the two tests in both languages. We also found a clear connection between the accuracy of pronunciation and comprehensibility: both underwent an evident improvement from Test 1 to Test 2.
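The numeric coding of the forced-choice answers described in this section can be sketched as follows. The coding values (0 to 3) follow the text; the answer labels, the function name and the example answers are hypothetical and do not reproduce the study's data.

```python
from collections import Counter
from statistics import mean

# Coding scheme as described in the text: 0 = negative development,
# 1 = the tests were equal, 2 = Test 2 better, 3 = Test 2 much better.
CODES = {
    "Test 1 sounded better": 0,
    "The tests were equal": 1,
    "Test 2 sounded better": 2,
    "Test 2 sounded much better": 3,
}


def code_answers(answers):
    """Translate forced-choice answer labels into their numeric codes."""
    return [CODES[answer] for answer in answers]


# Hypothetical answers from four listeners (NOT the study's data).
answers = [
    "Test 2 sounded better",
    "Test 2 sounded much better",
    "The tests were equal",
    "Test 2 sounded better",
]

scores = code_answers(answers)
mean_score = mean(scores)          # average development score
distribution = Counter(scores)     # how often each code was chosen
print(scores, mean_score, dict(distribution))
```

A mean above 1 on this coding indicates that listeners on average judged Test 2 to be closer to the target language than Test 1.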

Summary
A first finding of our study is that an L2 with a transparent orthography is not easier to pronounce for a naïve L2 learner than an L2 with an opaque orthography. This seems to be the case at least in the very first stages of language learning, when the L2 learner has not yet gained knowledge of the orthography-pronunciation correspondences and discrepancies in the L2. This result answers our first research question.
A second finding of our study, answering the second research question, is that even a short period of auditory training (supported by simultaneous orthographic input) can induce a significant improvement in both comprehensibility and accuracy of L2 speech. This improvement was similar in the languages despite the difference in orthographic depth between them.
Previous studies have suggested that transparent orthography is beneficial for L2 pronunciation learning [3,10]. This is probably true in later stages of L2 learning than the stage examined in the present study. Yet, we find it questionable how positive an effect orthographic input can have on L2 pronunciation learning, especially on the accuracy of pronunciation (= the accentedness), in any language, because the phonetic realization differs from the orthography in tens of small but important details also in languages with a transparent orthography. For example, in Finnish, important prosodic features such as sentence stress, intonation, proportional syllable durations within words and phrases, and reduction are not visible in the orthography.
The second finding of our study, that even a short period of auditory training can induce improvement in both the comprehensibility and accuracy of L2 speech, seems to hold at least when the L2 speaker is skillful in mimicking an L1 speaker. In our study, the impersonator could concentrate on just mimicking without the cognitive burden of producing grammatical and pragmatically functional speech [21], and he had one specific L1 speaker as his pronunciation model. Even with ordinary L2 speakers, mimicking one specific native speaker could be an effective strategy for pronunciation learning. This method should be studied in more detail in future research. Further, the results suggest that mimicking an L1 speaker could be effective especially when learning the important prosodic features. Prosodic features can be seen as larger, more general characteristics of a language than segments, which is why they might be easier to mimic than segmental features. Prosodic features may also be more salient for the naïve listener for the same reason.

Acknowledgements
A special thanks to the speech impersonator. This study is part of the research project Fokus på uttalsinlärningen med svenska som mål- och

Figure 1: Listeners' comments (n = 58) on what pronunciation features the development in Test 2 concerned. The number of individual comments is given on the y-axis, the commented features on the x-axis. Regarding Prosody, the prosodic features Rhythm, Intonation and Tempo are given separately in the figure. Further, Clarity is given as its own feature, because the comments on Clarity concerned overall clarity of the speech, not prosody or segments.

Table 1: Pre-training (1) and post-training (2) mean and standard deviation (SD) for comprehensibility in the two languages, and the significance of differences (Wilcoxon Signed-Rank Test) between the languages.

Table 3: Development in accuracy of pronunciation as rated by the listeners (negative development = 0, no development = 1, Test 2 was better = 2, Test 2 was much better = 3).