DSpace at KOASAS: (A) model of masking as a front-end for the robust speech recognition


(A) model of masking as a front-end for the robust speech recognition / 잡음 둔감한 음성 인식을 위한 마스킹 모델 (A masking model for noise-insensitive speech recognition)


Park, Ki-Young / 박기영

Automatic speech recognition (ASR) is emerging as one of the most promising technologies of the near future. One of the key challenges in ASR research is the sensitivity of ASR systems to acoustic interference such as noise and reverberation. In this dissertation, the masking effect observed in human auditory perception is used to build noise-robust ASR systems. Masking is the process by which the threshold of audibility for one sound is raised by the presence of another sound, and it is believed to enhance hearing resolution by cutting off redundant signals. Biological evidence for two kinds of masking, frequency masking and temporal masking, is exploited to model the masking effects, and both types of masking are implemented with conventional speech recognition systems. To improve performance further, engineering approaches are introduced using frequency-domain and time-domain filters. Frequency masking is modeled by lateral inhibition in the frequency domain, and temporal masking by unilateral inhibition in the time domain. The filter parameters, which determine the amount and range of inhibition, are searched on the basis of recognition performance on isolated-word recognition tasks. The proposed models are incorporated into conventional feature extraction methods, including the Mel-frequency cepstral coefficient (MFCC) model and the zero-crossing peak-amplitude (ZCPA) model. The MFCC model works well with the proposed model of frequency masking, while the ZCPA model has a built-in frequency-masking property. Temporal masking is applied to both models in the same way. The proposed masking model yields superior recognition performance and is also computationally efficient. For further improvement, two additional methods are used with the proposed model. Spectral subtraction, a widely used conventional method, shows much greater improvement when used wi...
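The two filters named in the abstract, lateral inhibition across frequency channels and unilateral inhibition along time, can be illustrated with a minimal Python sketch. This is not the dissertation's implementation: the function names, the subtractive form, the rectification at zero, and the parameters `amount` and `reach` (standing in for the "amount and range of inhibition") are all assumptions made for illustration only.

```python
def lateral_inhibition(spectrum, amount=0.25, reach=1):
    """Frequency masking sketch: each channel is inhibited by its
    neighbours on BOTH sides (hypothetical subtractive form)."""
    n = len(spectrum)
    out = []
    for k in range(n):
        inhibition = 0.0
        for d in range(1, reach + 1):
            for j in (k - d, k + d):
                if 0 <= j < n:  # ignore neighbours outside the band
                    inhibition += spectrum[j]
        # Rectify at zero so that energies stay non-negative (assumption).
        out.append(max(0.0, spectrum[k] - amount * inhibition))
    return out


def unilateral_inhibition(frames, amount=0.25, reach=1):
    """Temporal masking sketch: each frame is inhibited only by the
    frames that PRECEDE it, channel by channel."""
    out = []
    for t, frame in enumerate(frames):
        masked = []
        for k, value in enumerate(frame):
            inhibition = sum(
                frames[t - d][k] for d in range(1, reach + 1) if t - d >= 0
            )
            masked.append(max(0.0, value - amount * inhibition))
        out.append(masked)
    return out


if __name__ == "__main__":
    # A spectral peak surrounded by weaker channels.
    print(lateral_inhibition([1.0, 1.0, 4.0, 1.0, 1.0]))
    # A loud frame followed by two quiet frames (one channel each).
    print(unilateral_inhibition([[4.0], [1.0], [1.0]]))
```

In this sketch a strong spectral peak is largely preserved while its weaker neighbours are suppressed, and a loud frame suppresses the frame that immediately follows it but not the frames before it. In the dissertation the corresponding parameters are searched by recognition performance; the values used here are arbitrary.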

Advisors: Lee, Soo-Young / 이수영

Description: Korea Advanced Institute of Science and Technology: Electrical Engineering major

Publisher: Korea Advanced Institute of Science and Technology (한국과학기술원)

Issue Date: 2003

Identifier: 231120/325007 / 000995133

Language: eng

Description: Doctoral dissertation - Korea Advanced Institute of Science and Technology: Electrical Engineering major, 2003.8, [ viii, 90 p. ]

Keywords: feature extraction; speech recognition; masking model; auditory model

URI: http://hdl.handle.net/10203/35170

Link: http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=231120&flag=dissertation

Appears in Collection: EE-Theses_Ph.D. (doctoral theses)

Files in This Item: There are no files associated with this item.


Knowledge Service Development Team, KAIST 291 Daehak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea. T. 82-42-350-4493 Email. koasas@kaist.ac.kr
Copyright © 2016. Korea Advanced Institute of Science and Technology. All Rights Reserved.
