Emotional speech from machine
This publication is copyrighted. You may download, display and print it for your own personal use. Commercial use is prohibited.
Emotional speech is the expressiveness in speech that is conveyed through changes in pitch, loudness, timbre, speech rate and pauses. Although current TTS technology can convert a given text into speech, the output sounds monotonous and lacks emotion and naturalness. Adding emotion is therefore highly valued as a way to improve artificial voices. In this thesis, we create a system that uses a speech markup language to produce emotion in speech by analysing the tone of the given text. For this purpose, we combine the IBM tone analyser with a TTS system that accepts the speech markup language. We perform an empirical study of two experimental implementations using two TTS systems and two speech markup languages. The first combination involves IBM TTS and SSML, and the second combines MARY TTS and EmotionML. The mark-ups are predefined in EmotionML for four major emotions, namely anger, fear, joy and sadness; for SSML, prosody values from a previous study are used. This study describes the two implementations and evaluates their emotional speech output, which is then compared with the human voice to assess how close it comes. ...
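As a rough illustration of the SSML route described in the abstract, the sketch below wraps a text in an SSML `prosody` element chosen by emotion label. It is not the thesis implementation: the function name `to_ssml`, the `PROSODY` table and every prosody value in it are hypothetical placeholders, and the emotion label is assumed to come from a separate tone-analysis step that is not shown.

```python
from xml.sax.saxutils import escape

# Hypothetical prosody settings per emotion. These are illustrative
# placeholders only, not the values used in the thesis.
PROSODY = {
    "anger":   {"pitch": "+10%", "rate": "fast",   "volume": "loud"},
    "fear":    {"pitch": "+15%", "rate": "fast",   "volume": "soft"},
    "joy":     {"pitch": "+10%", "rate": "medium", "volume": "medium"},
    "sadness": {"pitch": "-10%", "rate": "slow",   "volume": "soft"},
}


def to_ssml(text, emotion):
    """Wrap text in an SSML <prosody> element for the given emotion.

    Unknown emotion labels fall back to plain, unmodified speech.
    """
    body = escape(text)  # escape &, < and > so the markup stays well-formed
    settings = PROSODY.get(emotion)
    if settings is not None:
        attrs = " ".join(f'{name}="{value}"' for name, value in settings.items())
        body = f"<prosody {attrs}>{body}</prosody>"
    return f"<speak>{body}</speak>"


if __name__ == "__main__":
    print(to_ssml("I can't believe we won!", "joy"))
```

The resulting string could then be passed to any TTS engine that accepts SSML input; keeping the emotion-to-prosody table separate from the wrapping logic makes it easy to swap in empirically derived values.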
- Master's theses (Pro gradu -tutkielmat)
Showing items with similar title or keywords.
Korolainen, Valtteri (2014) Various language technology applications have been in everyday use for decades. For example, predictive text input and automatic correction have been in use for decades. Speech recognition and language ...
Distinct Patterns of Functional Connectivity During the Comprehension of Natural, Narrative Speech Zhu, Yongjie; Liu, Jia; Ristaniemi, Tapani; Cong, Fengyu (World Scientific, 2020) Recent continuous task studies, such as narrative speech comprehension, show that fluctuations in brain functional connectivity (FC) are altered and enhanced compared to the resting state. Here, we characterized the ...
Coherence between brain activation and speech envelope at word and sentence levels showed age-related differences in low frequency bands Kolozsvári, Orsolya B; Xu, Weiyong; Gerike, Georgia; Parviainen, Tiina; Nieminen, Lea; Noiray, Aude; Hämäläinen, Jarmo A (MIT Press, 2021) Speech perception is dynamic and shows changes across development. In parallel, functional differences in brain development over time have been well documented and these differences may interact with changes in speech ...
Top-Down Predictions of Familiarity and Congruency in Audio-Visual Speech Perception at Neural Level Kolozsvári, Orsolya B.; Xu, Weiyong; Leppänen, Paavo H. T.; Hämäläinen, Jarmo A. (Frontiers Media, 2019) During speech perception, listeners rely on multimodal input and make use of both auditory and visual information. When presented with speech, for example syllables, the differences in brain responses to distinct stimuli ...
Gavriushenko, Mariia; Lindberg, Renny S. N.; Khriyenko, Oleksiy (IATED Academy, 2017) Personalized learning is increasingly gaining popularity, especially with the development of information technology and modern educational resources for learning. Each person is individual and has different knowledge ...