Acoustic feature compensation by class-based histogram equalization for robust speech recognition

This dissertation proposes class-based histogram equalization (CHEQ) methods that compensate noisy acoustic features for robust speech recognition. The methods aim not only to compensate for the acoustic mismatch between the training and test environments, but also to reduce the two fundamental limitations of conventional histogram equalization (HEQ). Whereas conventional HEQ uses global reference and test cumulative distribution functions (CDFs), the proposed methods use a set of class-specific reference and test CDFs: they classify noisy test features into their corresponding classes and equalize each feature with the reference and test distributions of its class. Depending on how class information is used, the methods take two forms: hard-CHEQ, based on vector quantization, and soft-CHEQ, based on a Gaussian mixture model (GMM). A class-tying technique is incorporated into both CHEQ methods to improve classification accuracy and to circumvent the sparse-data problem inherent in the class-based approach. Finally, CHEQ is combined with a minimum mean-square error log-spectral amplitude (MMSE-LSA) estimator for further performance improvement. Experiments on the Aurora-2 database confirmed the effectiveness of the CHEQ methods. Hard-CHEQ and soft-CHEQ produce overall average error reductions of 60.13% and 61.19%, respectively, over the mel-frequency cepstral coefficient (MFCC)-based baseline features, and of 17.50% and 19.68% over conventional HEQ. With the class-tying technique, hard-CHEQ and soft-CHEQ provide additional improvements of 2.55% and 2.78% over the untied CHEQ methods, resulting in overall average error reductions of 61.15% and 62.27% over the MFCC-based baseline features and of 19.61% and 21.92% over conventional HEQ, respectively. A combination of MMSE-LSA with CHEQ yields marginal performance impr...
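The contrast the abstract draws between global HEQ and the hard class-based variant can be illustrated with a minimal Python sketch. It is not the dissertation's implementation: the function names `heq` and `hard_cheq`, the rank-based empirical CDF, the linear interpolation of the inverse reference CDF, and the nearest-centroid classifier standing in for vector quantization are all assumptions made for illustration. Class tying, soft-CHEQ, and the MMSE-LSA front end are not shown.

```python
import numpy as np

def heq(test, ref):
    """Map 1-D test values onto the reference distribution: y = F_ref^{-1}(F_test(x)).

    Assumption: both CDFs are plain empirical CDFs of the given samples.
    """
    test = np.asarray(test, dtype=float)
    ref = np.sort(np.asarray(ref, dtype=float))
    # Empirical test CDF from ranks, kept strictly inside (0, 1).
    ranks = np.argsort(np.argsort(test))
    cdf = (ranks + 0.5) / test.size
    # Inverse reference CDF by linear interpolation over the sorted reference samples.
    ref_cdf = (np.arange(ref.size) + 0.5) / ref.size
    return np.interp(cdf, ref_cdf, ref)

def hard_cheq(test, ref, ref_labels, centroids):
    """Hard class-based HEQ sketch for frames of shape (n_frames, n_dims).

    Assumption: classes are defined by hypothetical VQ centroids, and the
    reference frames already carry class labels.
    """
    test = np.asarray(test, dtype=float)
    ref = np.asarray(ref, dtype=float)
    ref_labels = np.asarray(ref_labels)
    centroids = np.asarray(centroids, dtype=float)
    # Nearest-centroid classification of each noisy test frame.
    dists = ((test[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    labels = dists.argmin(axis=1)
    out = test.copy()
    for k in np.unique(labels):
        idx = labels == k
        ref_k = ref[ref_labels == k]
        if ref_k.shape[0] == 0:
            continue  # no reference data for this class: leave frames unchanged
        # Equalize each feature dimension with the class-specific distributions.
        for d in range(test.shape[1]):
            out[idx, d] = heq(test[idx, d], ref_k[:, d])
    return out, labels
```

Calling `heq` on a whole utterance corresponds to the global case, while `hard_cheq` applies the same mapping separately within each class. A class with few frames gives a coarse empirical CDF, which is one way to picture the sparse-data problem the abstract mentions.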
Advisors
Kim, Hoi-Rin (김회린)
Description
Information and Communications University : School of Engineering
Publisher
Information and Communications University
Issue Date
2006
Identifier
392695/225023 / 020025901
Language
eng
Description

Doctoral dissertation (Ph.D.) - Information and Communications University : School of Engineering, 2006.8, [ viii, 124 p. ]

Keywords

Feature compensation; Class-based histogram equalization; Acoustic mismatch; Robust speech recognition

URI
http://hdl.handle.net/10203/54570
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=392695&flag=dissertation
Appears in Collection
School of Engineering-Theses_Ph.D
Files in This Item
There are no files associated with this item.
