DSpace at KOASAS: Statistical feature compensation and normalization for speech recognition in noisy environments

Statistical feature compensation and normalization for speech recognition in noisy environments

Suk, Yong-Ho / 석용호

A statistical mismatch in speech features between the training environment and the testing environment degrades the performance of cepstrum-based speech recognition systems. In this dissertation, we propose a new feature compensation method and two novel feature normalization algorithms based on statistical methods. When speech signals are contaminated by additive noise, the statistical properties of a speech feature vector vary with the type of noise and the signal-to-noise ratio (SNR) level. MultivaRiate-gAussian-based cepsTral normaliZation (RATZ) is one of the best-known EM-based feature compensation methods. However, the noisy model of RATZ represents only the mean shift of the feature vector. We propose a new noisy model for RATZ that represents the variance deviation as well as the mean shift. For feature normalization, conventional methods normalize only the mean and/or variance of the cepstrum features. However, deviations in higher-order moments also exist in noisy speech features. To fully normalize the variations of the statistical properties under noisy conditions, all the moments, or equivalently the probability density function (pdf), must be normalized. As a first step toward full normalization, we propose the cepstrum third-order normalization (CTN) method, which normalizes the third-order moment of the cepstrum as well as its mean and variance. Moreover, we propose the cepstrum pdf normalization (CPN) method, which fully normalizes the statistical properties. To accommodate various densities, the generalized Gaussian distribution (GGD) is used as the target pdf. A table lookup method is also used to reduce the computational load of CPN. In speaker-independent word recognition experiments, we show that the proposed methods improve performance over the conventional methods, especially in heavy noise environments.
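The following is an illustrative sketch only, not the CTN or CPN algorithms of the dissertation, whose details are not given in this record. It shows (a) conventional mean and variance normalization of cepstral features, (b) that a third-order moment (skewness) can remain after that normalization, and (c) a generic rank-based CDF matching to a unit-variance generalized Gaussian target through a precomputed lookup table. The CDF-matching step is a stand-in technique chosen for illustration, and all function names and parameters (`cmvn`, `skewness`, `ggd_quantile_table`, `pdf_normalize`, `beta`, `size`, `span`) are hypothetical.

```python
import math
import numpy as np

def cmvn(c):
    """Conventional mean/variance normalization per cepstral dimension.
    c has shape (frames, dims)."""
    mu = c.mean(axis=0)
    sigma = c.std(axis=0)
    return (c - mu) / np.maximum(sigma, 1e-12)

def skewness(c):
    """Third-order moment of the mean/variance-normalized features."""
    z = cmvn(c)
    return (z ** 3).mean(axis=0)

def ggd_quantile_table(beta=2.0, size=1024, span=8.0):
    """Quantile lookup table for a zero-mean, unit-variance generalized
    Gaussian, pdf proportional to exp(-(|x|/alpha)**beta).
    beta=2 gives a Gaussian, beta=1 a Laplacian."""
    # Scale alpha chosen so that the variance equals 1.
    alpha = math.sqrt(math.gamma(1.0 / beta) / math.gamma(3.0 / beta))
    x = np.linspace(-span, span, 20001)
    pdf = np.exp(-(np.abs(x) / alpha) ** beta)
    cdf = np.cumsum(pdf)
    cdf /= cdf[-1]  # numerical normalization of the CDF
    probs = (np.arange(size) + 0.5) / size
    return np.interp(probs, cdf, x)  # numerically inverted CDF

def pdf_normalize(c, table):
    """Map each dimension's empirical CDF onto the target quantiles
    (assumed stand-in for pdf normalization, not the thesis method)."""
    n = c.shape[0]
    ranks = c.argsort(axis=0).argsort(axis=0)  # rank 0..n-1 per dimension
    probs = (ranks + 0.5) / n
    idx = np.minimum((probs * len(table)).astype(int), len(table) - 1)
    return table[idx]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.exponential(size=(5000, 3))  # synthetic skewed "features"
    print("skewness after mean/variance normalization:", skewness(feats))
    out = pdf_normalize(feats, ggd_quantile_table(beta=2.0))
    print("skewness after CDF matching:", skewness(out))
```

In this sketch the table is built once per target shape `beta`, so normalizing an utterance costs only a sort and an index lookup per dimension, which is the general motivation for a table-based approach.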

Advisor: Lee, Hwang-Soo (이황수)

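Description: Korea Advanced Institute of Science and Technology (KAIST): Electrical Engineering

Publisher: Korea Advanced Institute of Science and Technology (KAIST)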

Issue Date: 2000

Identifier: 157626/325007 / 000949029

Language: eng

Description: Doctoral thesis (Ph.D.) - Korea Advanced Institute of Science and Technology (KAIST): Electrical Engineering, 2000.2, [ 104 p. ]

Keywords: speech recognition; feature normalization; noisy robust; cepstrum; feature compensation

URI: http://hdl.handle.net/10203/35832

Link: http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=157626&flag=dissertation

Appears in Collection: EE-Theses_Ph.D. (Doctoral Theses)

Files in This Item: There are no files associated with this item.
