Statistical feature compensation and normalization for speech recognition in noisy environments

A statistical mismatch in speech features between the training environment and the testing environment degrades the performance of cepstrum-based speech recognition systems. In this dissertation, we propose a new feature compensation method and two novel feature normalization algorithms based on statistical methods. When speech signals are contaminated by additive noise, the statistical properties of a speech feature vector vary with the type of noise and the signal-to-noise ratio (SNR) level. MultivaRiate-gAussian-based cepsTral normaliZation (RATZ) is one of the best-known feature compensation methods based on expectation-maximization (EM). However, the noisy model of RATZ represents only the mean shift of the feature vector. We propose a new noisy model for RATZ that represents the variance deviation as well as the mean shift. As for feature normalization, conventional methods normalize only the mean and/or variance of the cepstrum features. However, deviations of higher-order moments also exist in noisy speech features. To fully normalize the variations of the statistical properties under noisy conditions, all the moments, or equivalently the probability density function (pdf), must be normalized. As a first step toward full normalization, we propose the cepstrum third-order normalization (CTN) method, which normalizes the third-order moment of the cepstrum as well as its mean and variance. Moreover, we propose the cepstrum pdf normalization (CPN) method, which fully normalizes the statistical properties. To accommodate various densities, the generalized Gaussian distribution (GGD) is used as the target pdf. A table lookup method is also used to reduce the computational load of the CPN. Speaker-independent word recognition experiments show that the proposed methods outperform the conventional methods, especially in heavy noise environments.
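The motivation for going beyond mean and variance can be illustrated with a minimal sketch. It is not the CTN or CPN algorithm from the dissertation; it only shows the conventional cepstral mean and variance normalization that the abstract refers to, and checks that this normalization leaves the standardized third-order moment (skewness) unchanged. The function names `cmvn` and `skewness` and the sample values are assumptions made for illustration.

```python
import math


def cmvn(track):
    """Conventional mean and variance normalization of one cepstral
    coefficient track (one value per frame). Hypothetical helper."""
    n = len(track)
    mean = sum(track) / n
    var = sum((x - mean) ** 2 for x in track) / n  # population variance
    std = math.sqrt(var)
    return [(x - mean) / std for x in track]


def skewness(track):
    """Standardized third-order central moment of a track."""
    n = len(track)
    mean = sum(track) / n
    var = sum((x - mean) ** 2 for x in track) / n
    third = sum((x - mean) ** 3 for x in track) / n
    return third / var ** 1.5


# Made-up values of a single cepstral coefficient over eight frames,
# chosen to be asymmetric; they are not data from the dissertation.
track = [0.1, 0.3, 0.2, 1.9, 0.4, 0.2, 0.3, 2.5]
normalized = cmvn(track)

# The mean and variance are now 0 and 1, but the skewness is the same
# as before, because shifting and positive scaling cannot change it.
print(skewness(track), skewness(normalized))
```

Because skewness is invariant under shifting and positive scaling, any asymmetry that noise introduces into the feature distribution survives this kind of normalization, which is the gap that third-order and pdf normalization aim to close.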
Advisors
Lee, Hwang-Soo
Description
Korea Advanced Institute of Science and Technology (KAIST) : Electrical and Electronic Engineering
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2000
Identifier
157626/325007 / 000949029
Language
eng
Description

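Doctoral dissertation (Ph.D.) - Korea Advanced Institute of Science and Technology (KAIST) : Electrical and Electronic Engineering, 2000.2, [ 104 p. ]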

Keywords

speech recognition; feature normalization; noisy robust; cepstrum; feature compensation

URI
http://hdl.handle.net/10203/35832
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=157626&flag=dissertation
Appears in Collection
EE-Theses_Ph.D. (Doctoral Theses)
Files in This Item
There are no files associated with this item.
