DSpace at KOASAS: (A) study on the frequency-weighted spectral representations for robust speech recognition


(A) study on the frequency-weighted spectral representations for robust speech recognition


Kim, Ki-Chul / 김기철

To use a speech recognition system in a practical environment, the recognizer must be robust to unexpected changes in the acoustic environment as well as to the inherent variability of the speech signal. In this dissertation we propose and analyze robust spectral representations derived from the short-time speech spectrum, which are less sensitive to environmental changes due to speakers or background noise and yield a smaller number of bits or a lower-dimensional representation without performance degradation. By applying spectral peak enhancement to an all-pole model spectrum and integrating the resultant binarized spectrum with a nonlinear frequency scale, several kinds of frequency-weighted spectral representations were derived. The spectral peaks were enhanced by thresholding the second-order spatial derivative of the spectral envelope with respect to frequency, which both enhances the peaks and relaxes the sensitivity of the distance measure to peak amplitude. The proposed representations are applied to nasal, stop, and isolated digit recognition in clean, white-noise-added, and band-limited conditions. To evaluate recognition performance, we used all-pole model based features, LPC cepstrum and mel-cepstrum, and filter bank based features with Euclidean and several weighted cepstral distance measures in a template-based isolated word recognition system. The proposed features improved recognition accuracy for the nasal-vowel syllables in clean, white Gaussian noise added, and band-limited conditions. Performance on stop-vowel syllable recognition decreased. For isolated digits, recognition accuracy was competitive with the best conventional representation and distance measure, while robustness and representation efficiency improved. Among the frequency-weighted spectral representations, the critical-band peak presence vector (CBPPV), which consists of 17 bits representing peak ...
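The peak-enhancement and band-integration steps described in the abstract can be illustrated with a minimal Python sketch. It is not the dissertation's implementation: the threshold value, the toy two-bump envelope, the Zwicker-style Bark approximation in `hz_to_bark`, and the rule "one bit per band, set if any peak bin falls in the band" are all assumptions made for illustration. The band count of 17 is chosen only to mirror the 17-bit CBPPV mentioned above.

```python
import math

def second_difference(envelope):
    """Discrete second derivative along frequency; endpoints are left at 0."""
    d2 = [0.0] * len(envelope)
    for k in range(1, len(envelope) - 1):
        d2[k] = envelope[k - 1] - 2.0 * envelope[k] + envelope[k + 1]
    return d2

def binarize_peaks(envelope, threshold):
    """Mark bins whose curvature is below -threshold (assumed peak criterion)."""
    return [1 if d2 < -threshold else 0 for d2 in second_difference(envelope)]

def hz_to_bark(freq_hz):
    """Zwicker-style approximation of the critical-band (Bark) scale (assumed here)."""
    return 13.0 * math.atan(0.00076 * freq_hz) + 3.5 * math.atan((freq_hz / 7500.0) ** 2)

def band_peak_presence(binary, bin_freqs_hz, n_bands):
    """One bit per band: 1 if any binarized peak bin falls inside that band."""
    bits = [0] * n_bands
    for flag, freq in zip(binary, bin_freqs_hz):
        band = min(int(hz_to_bark(freq)), n_bands - 1)
        if flag:
            bits[band] = 1
    return bits

# Hypothetical spectral envelope: two smooth bumps near 500 Hz and 2500 Hz.
n_bins = 64
freqs = [k * 4000.0 / (n_bins - 1) for k in range(n_bins)]
envelope = [
    math.exp(-((f - 500.0) / 150.0) ** 2) + 0.8 * math.exp(-((f - 2500.0) / 200.0) ** 2)
    for f in freqs
]

binary = binarize_peaks(envelope, threshold=0.05)          # threshold is an assumed value
bits = band_peak_presence(binary, freqs, n_bands=17)       # 17 bands, as an assumption
flat_bits = band_peak_presence(binarize_peaks([1.0] * n_bins, 0.05), freqs, 17)
print(bits)
```

Because each bit records only whether a peak is present, scaling a peak's amplitude leaves the vector unchanged as long as its curvature stays past the threshold, which is one way to read the abstract's remark about reduced sensitivity to peak amplitude.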

Advisors: Cho, Jung-Wan; 조정완

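Description: Korea Advanced Institute of Science and Technology (KAIST): Department of Computer Science

Publisher: Korea Advanced Institute of Science and Technology (KAIST)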

Issue Date: 1992

Identifier: 59805/325007 / 000835041

Language: eng

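Description: Doctoral dissertation (Ph.D.) - Korea Advanced Institute of Science and Technology (KAIST): Department of Computer Science, 1992.2, [ [x], 144 p. ]

Keywords: spectral representation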

URI: http://hdl.handle.net/10203/32923

Link: http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=59805&flag=dissertation

Appears in Collection: CS-Theses_Ph.D. (Doctoral dissertations)

Files in This Item: There are no files associated with this item.


Knowledge Service Development Team, KAIST 291 Daehak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea. T. 82-42-350-4493 Email. koasas@kaist.ac.kr
Copyright © 2016. Korea Advanced Institute of Science and Technology. All Rights Reserved.
