DSpace at KOASAS: Robust speech recognition based on partial information technique

Robust speech recognition based on partial information technique (부분 정보 기법에 기반한 강인한 음성인식)

Cho, Hoon-Young / 조훈영

Automatic speech recognition (ASR) systems in real environments may have to cope with various noise signals that corrupt some time-frequency regions of speech more severely than others. Although great progress has been made in robust ASR, most techniques have focused on reasonably stationary wide-band noise and are therefore limited in their ability to achieve robustness in real noisy environments. The partial information technique, a developing area of research, models the human ability to emphasize reliable partial information in time-frequency regions. As one of its main approaches, the multi-band ASR scheme splits the whole frequency range into several sub-bands. The sub-band recognition results are then recombined, using sub-band reliabilities, to make a final decision. This dissertation aims to improve ASR performance on partially corrupted speech using the partial information technique. In the frequency domain, three limitations of the multi-band recognition system are addressed. First, the multi-band scheme cannot make maximal use of the uncorrupted parts because the sub-band boundaries are fixed; the boundaries should adapt so as to localize the noise and better utilize partial information. Second, because sub-band feature vectors are processed independently in this method, the information contained in the global spectral structure may be lost. Finally, the whole ASR system must be rebuilt because of its architectural differences from a full-band ASR system. This study proposes a weighted filter bank analysis and model adaptation (WFBA-MA) method to resolve these problems. The proposed scheme estimates reliability weights for the Mel filter bank channels and extracts weighted Mel-frequency cepstral coefficients by suppressing unreliable log filter bank energies. The same weights are also applied to the entire set of HMM parameters. An environment selective processing (ESP) method is also proposed, which determines whether a...
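The weighted cepstral extraction described in the abstract can be illustrated with a minimal Python sketch. It is not the dissertation's implementation: the abstract does not say how the reliability weights are estimated or exactly how unreliable energies are suppressed. The sketch assumes the weights are already given as values in [0, 1], one per Mel channel, and that suppression means scaling each log filter bank energy by its weight before an unnormalized DCT-II. The names `dct_matrix` and `weighted_mfcc` are hypothetical.

```python
import numpy as np


def dct_matrix(num_ceps, num_channels):
    """Unnormalized DCT-II basis, shape (num_ceps, num_channels)."""
    n = np.arange(num_channels)
    k = np.arange(num_ceps)[:, None]
    return np.cos(np.pi * k * (n + 0.5) / num_channels)


def weighted_mfcc(log_energies, weights, num_ceps=13):
    """Cepstral coefficients from reliability-weighted log Mel energies.

    log_energies: log filter bank energies for one frame, shape (C,).
    weights: assumed reliability weights in [0, 1], shape (C,);
             1 keeps a channel, 0 suppresses it entirely.
    """
    log_energies = np.asarray(log_energies, dtype=float)
    weights = np.clip(np.asarray(weights, dtype=float), 0.0, 1.0)
    if log_energies.shape != weights.shape:
        raise ValueError("log_energies and weights must have the same shape")
    # Assumption: suppression is a per-channel scaling in the log domain.
    weighted = weights * log_energies
    return dct_matrix(num_ceps, log_energies.shape[-1]) @ weighted


# Example frame with four Mel channels; the two upper channels are
# treated as corrupted and down-weighted.
frame = np.array([1.0, 2.0, 3.0, 4.0])
reliability = np.array([1.0, 1.0, 0.2, 0.0])
cepstra = weighted_mfcc(frame, reliability, num_ceps=3)
```

With all weights set to 1 the function reduces to an ordinary unnormalized DCT of the log energies, so a conventional full-band front end is the special case in which every channel is judged reliable.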

Advisors: Oh, Yung-Hwan (오영환)

Description: Korea Advanced Institute of Science and Technology (KAIST): Computer Science

Publisher: Korea Advanced Institute of Science and Technology (KAIST)

Issue Date: 2003

Identifier: 181185/325007 / 000985369

Language: eng

Description: Doctoral thesis (Ph.D.) - Korea Advanced Institute of Science and Technology (KAIST): Computer Science, 2003.2, [ix, 88 p.]

Keywords: Weighted filter bank analysis; Partially corrupted speech; Multi-band speech recognition; Partial information; Model adaptation

URI: http://hdl.handle.net/10203/32838

Link: http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=181185&flag=dissertation

Appears in Collection: CS-Theses_Ph.D.(박사논문)

Files in This Item: There are no files associated with this item.

Knowledge Service Development Team, KAIST 291 Daehak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea. T. 82-42-350-4493 Email. koasas@kaist.ac.kr
Copyright © 2016. Korea Advanced Institute of Science and Technology. All Rights Reserved.
