HMM-based Korean speech synthesis using suprasegmental prosodic features
초분절적 운율 정보를 이용한 HMM 기반 한국어 음성 합성

DC Field | Value | Language
dc.contributor.advisor | Oh, Yung-Hwan | -
dc.contributor.advisor | 오영환 | -
dc.contributor.author | Lee, Seung-Uk | -
dc.contributor.author | 이승욱 | -
dc.date.accessioned | 2011-12-13T06:09:48Z | -
dc.date.available | 2011-12-13T06:09:48Z | -
dc.date.issued | 2011 | -
dc.identifier.uri | http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=467932&flag=dissertation | -
dc.identifier.uri | http://hdl.handle.net/10203/34973 | -
dc.description | Thesis (Master's) - Korea Advanced Institute of Science and Technology (KAIST) : Department of Computer Science, 2011.2, [ vi, 39 p. ] | -
dc.description.abstract | Hidden Markov models (HMMs) are widely used in recent research on statistical parametric speech synthesis systems. An HMM is a generative model frequently used in speech recognition; in speech synthesis it is applied to parameter generation, the stage that precedes signal processing. HMM-based speech synthesis has several advantages. It requires much less storage, because the speech corpus need not be kept after training is finished. Furthermore, speech with various voice characteristics, speaking styles, and emotions is easy to obtain by modifying the parameters. Other advantages include multilingual support, robustness, and the ability to control each parameter separately. Commonly cited drawbacks of this kind of speech synthesis, such as a vocoder-like sound or unnaturalness caused by reconstructing speech from parameters, are gradually being overcome. However, most HMM-based speech synthesis approaches remain weak in prosody. Prosody is an important factor in verbal communication; some research claims that prosody has a greater impact on communication than the meaning of the words themselves. The primary weakness of HMM-based speech synthesis systems in generating prosody is that they model prosodic features in subword units, i.e. phones: each model in the trained HMM set corresponds to a phone. They therefore have difficulty using suprasegmental information such as the relations between words, the structure of the sentence, and the lengths of each word, phrase, and sentence. As a result, an HMM-based system lacks the capability to create natural speech with human-like changes in pitch and tempo; instead it creates machine-like speech that places the same pause at every space and pronounces all words in the same way, without any change in strength or rate. Context-dependent HMMs have been proposed to overcome this problem, but they have not been a fundamental solution for prosody.
We have researched generating more ... | eng
dc.language | eng | -
dc.publisher | 한국과학기술원 (Korea Advanced Institute of Science and Technology, KAIST) | -
dc.subject | prosody | -
dc.subject | CART | -
dc.subject | HMM | -
dc.subject | Speech synthesis | -
dc.subject | Korean | -
dc.subject | 한국어 (Korean) | -
dc.subject | 운율 (prosody) | -
dc.subject | 분류 및 회귀 트리 (classification and regression tree) | -
dc.subject | 은닉 마르코프 모델 (hidden Markov model) | -
dc.subject | 음성 합성 (speech synthesis) | -
dc.title | HMM-based Korean speech synthesis using suprasegmental prosodic features | -
dc.title.alternative | 초분절적 운율 정보를 이용한 HMM 기반 한국어 음성 합성 | -
dc.type | Thesis(Master) | -
dc.identifier.CNRN | 467932/325007 | -
dc.description.department | Korea Advanced Institute of Science and Technology (KAIST) : Department of Computer Science | -
dc.identifier.uid | 020083380 | -
dc.contributor.localauthor | Oh, Yung-Hwan | -
dc.contributor.localauthor | 오영환 | -
Appears in Collection
CS-Theses_Master(석사논문)
Files in This Item
There are no files associated with this item.
