Personal prosody model based Korean emotional speech synthesis = 개인 운율 모델 기반 한국어 감정 음성 합성

Speech is the most basic and widely used means of expressing thoughts in human-human interaction, and it has long been studied as a user-friendly interface between humans and machines. Recent progress in speech synthesis has produced artificial speech with very high intelligibility, but sound quality and the naturalness of inflection remain major issues. Beyond improvements in sound quality and naturalness, there is a growing need for methods that generate speech with emotions, so that information can be delivered in a natural and effective way. For this purpose, various types of emotional expression are usually first transcribed into corresponding datasets, which are then used to model each type of emotional speech. This kind of massive dataset analysis has improved the performance of information-providing services both quantitatively and qualitatively. In this dissertation, however, I argue that this approach does not work well for interactions that are based on personal experience, such as emotional speech synthesis. We know empirically that individual speakers have their own ways of expressing emotions based on their personal experience, and massive dataset management may easily overlook these personalized and relative differences. Therefore, this dissertation examines the emotional prosody structures of four basic emotions (anger, fear, happiness, and sadness) by considering their personalized and relative differences. As a result, this dissertation shows that the emotional prosody structures of pitch and speech rate tend to depend more on individual speakers (i.e., personal information) than those of intensity and pause length do. This personal information enables the modeling of relative differences of each emotional prosody structure (i.e., a personal prosody model), the possibilities of which were dismissed earlier during the application of massive dataset analy...
Advisors
Park, Jong-C. (박종철)
Description
Korea Advanced Institute of Science and Technology (KAIST) : Department of Computer Science
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2010
Identifier
455447/325007  / 020035868
Language
eng
Description

Doctoral dissertation (Ph.D.) - Korea Advanced Institute of Science and Technology (KAIST) : Department of Computer Science, 2010.08, [ x, 86 p. ]

Keywords

prosody modeling; Korean emotional speech synthesis; natural language processing; personal model

URI
http://hdl.handle.net/10203/33319
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=455447&flag=dissertation
Appears in Collection
CS-Theses_Ph.D.(박사논문)
Files in This Item
There are no files associated with this item.
