DSpace at KOASAS: HMM-based Korean speech synthesizer with two-band mixed excitation model for embedded applications

HMM-based Korean speech synthesizer with two-band mixed excitation model for embedded applications (Korean title: 임베디드 시스템을 위한 2대역 혼합 여기 모델과 은닉 마코프 모델(HMM)기반의 한국어 음성 합성기)

Kim, Sang-Jin / 김상진

A speech interface may be the first choice of user interface for robots and hand-held devices such as personal digital assistants (PDAs) and portable multimedia players (PMPs). However, these devices have limited memory and computing power. Hidden Markov model (HMM)-based speech synthesis is presently considered suitable for such embedded systems. This thesis describes an HMM-based Korean speech synthesis system, a comparison of spectral parameters, and a proposed two-band excitation model for HMM-based speech synthesis.

First, the development and evaluation of an HMM-based Korean speech synthesis system are presented. Statistical HMMs for Korean speech units are trained on a hand-labeled speech database that includes contextual information about phoneme, word phrase, utterance, and break strength. The developed system produced speech with fairly good prosody. The synthesized speech is evaluated and compared with that of a corpus-based unit-concatenation Korean text-to-speech system. The two systems were trained on the same manually labeled speech database.

Second, the mel-cepstrum and the line spectrum pair (LSP) are compared as spectral parameters for the developed HMM-based speech synthesis system. Because mel-cepstral analysis has a couple of advantages over linear prediction analysis and conventional cepstral analysis, mel-cepstral coefficients have been used as the features for spectrum modeling in HMM-based speech synthesis. Although the LSP also meets the stability and quantization/interpolation requirements of the synthesis filter, its feasibility for HMM-based speech synthesis has not been tested. In this thesis, the LSP and mel-cepstrum parameters are both tested for HMM-based speech synthesis, and a comparative performance evaluation is carried out. The two systems are trained on the same manually labeled speech database. The results show that the LSP can be a go...

Advisors: Hahn, Min-Soo (한민수)

Description: Information and Communications University (한국정보통신대학교): School of Engineering

Publisher: Information and Communications University (한국정보통신대학교)

Issue Date: 2007

Identifier: 392804/225023 / 020025338

Language: eng

Description: Thesis (Ph.D.) - Information and Communications University (한국정보통신대학교): School of Engineering, 2007.2, [ xiii, 102 p. ]

Keywords: speech synthesis; HMM; hidden Markov model; excitation model; maximum voiced frequency

URI: http://hdl.handle.net/10203/54582

Link: http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=392804&flag=dissertation

Appears in Collection: School of Engineering-Theses_Ph.D(공학부 박사논문)

Files in This Item: There are no files associated with this item.
