DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | Lee, Soo-Young | - |
dc.contributor.advisor | 이수영 | - |
dc.contributor.author | Lee, Chang-Hoon | - |
dc.contributor.author | 이창훈 | - |
dc.date.accessioned | 2011-12-14 | - |
dc.date.available | 2011-12-14 | - |
dc.date.issued | 2006 | - |
dc.identifier.uri | http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=258130&flag=dissertation | - |
dc.identifier.uri | http://hdl.handle.net/10203/36062 | - |
dc.description | Thesis (Ph.D.) - Korea Advanced Institute of Science and Technology (한국과학기술원): Electrical and Electronic Engineering, 2006.8, [ x, 83 p. ] | - |
dc.description.abstract | A top-down selective attention model, drawn from psychological research, is proposed for recognizing isolated words in noisy environments. The model is applied to a hidden Markov model (HMM) classifier, which is widely used for automatic speech recognition. An attention filter is introduced at the output of the Mel-filterbank, whose filter shapes resemble those of the cochlear filterbank, where human attention might be processed. The attention filter is adapted by changing its gain to maximize the log likelihood of the attended test input speech. However, because the log likelihood of the attended input under the selected model always increases, any input signal can be attended to any model, and the attention process produces over-fitted attended data. A low-complexity constraint is therefore proposed to prevent the attention filter from over-fitting. The first method uses bilinear kernels, which map the attention filter to a lower-resolution subspace and so reduce its complexity effectively. Experiments were run with different grid sizes and different levels of white Gaussian noise. The recognition results improve. The false recognition rates are 41% and 54% at 20 dB SNR and 15 dB SNR, respectively. However, the attention filter with bilinear kernels is limited in how well it can model attention in some cases, since the peak values of the attention filter are tied to the grid positions. The model therefore needs a mechanism for finding the proper center position and width of the receptive field. A second way to reduce the complexity of the attention filter uses Gaussian kernels, for which not only the weights but also the center position and the width of the receptive field are adapted. The attention filter with Gaussian kernels is adapted by gradient methods. With this attention filter, the false recognition rates decrease by 36% and 46% at 20 dB SNR and 15 dB SNR, respectively. Although the bilinear model shows better p... | eng |
dc.language | eng | - |
dc.publisher | Korea Advanced Institute of Science and Technology (한국과학기술원) | - |
dc.subject | speech recognition | - |
dc.subject | HMM | - |
dc.subject | selective attention | - |
dc.subject | low resolution constraint | - |
dc.subject | confidence measure (확신척도) | - |
dc.subject | low resolution constraint (저해상도 제한) | - |
dc.subject | speech recognition (음성인식) | - |
dc.subject | selective attention (선택적 주의집중) | - |
dc.subject | confidence measure | - |
dc.title | Noise robust speech recognition using kernel-based top-down selective attention | - |
dc.title.alternative | Noise-robust speech recognition using a kernel-based top-down selective attention model (커널 기반 하향식 주의집중 모델을 이용한 잡음에 강인한 음성인식) | - |
dc.type | Thesis(Ph.D) | - |
dc.identifier.CNRN | 258130/325007 | - |
dc.description.department | Korea Advanced Institute of Science and Technology (한국과학기술원): Electrical and Electronic Engineering | - |
dc.identifier.uid | 020015227 | - |
dc.contributor.localauthor | Lee, Soo-Young | - |
dc.contributor.localauthor | 이수영 | - |
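
The Gaussian-kernel attention filter described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation, and it rests on several assumptions: a single diagonal Gaussian stands in for the HMM likelihood, the filter runs over frequency channels only, the gain is additive in the log Mel-filterbank domain, and the step is halved whenever the likelihood fails to improve. All function and variable names are hypothetical. Using only a few kernels plays the role of the low-complexity constraint.

```python
import math

def gaussian_kernel(c, mu, sigma):
    """Receptive field centered at mu with width sigma, evaluated at channel c."""
    return math.exp(-((c - mu) ** 2) / (2.0 * sigma ** 2))

def attention(params, n_channels):
    """Log-domain gain per channel: a weighted sum of Gaussian kernels."""
    return [sum(w * gaussian_kernel(c, mu, s) for w, mu, s in params)
            for c in range(n_channels)]

def log_likelihood(x, params, mean, var):
    """Log likelihood of the attended input under a diagonal Gaussian
    (assumption: a stand-in for the HMM score used in the thesis)."""
    a = attention(params, len(x))
    return sum(-0.5 * ((xc + ac - m) ** 2 / v + math.log(2 * math.pi * v))
               for xc, ac, m, v in zip(x, a, mean, var))

def gradient(x, params, mean, var):
    """Partial derivatives of the log likelihood w.r.t. each kernel's
    weight, center, and width."""
    a = attention(params, len(x))
    dy = [-(xc + ac - m) / v for xc, ac, m, v in zip(x, a, mean, var)]
    grads = []
    for w, mu, s in params:
        gw = gmu = gs = 0.0
        for c, d in enumerate(dy):
            g = gaussian_kernel(c, mu, s)
            gw += d * g
            gmu += d * w * g * (c - mu) / s ** 2
            gs += d * w * g * (c - mu) ** 2 / s ** 3
        grads.append((gw, gmu, gs))
    return grads

def adapt(x, params, mean, var, steps=50, lr=0.1):
    """Gradient ascent on the log likelihood; a step is kept only if it
    improves the likelihood, otherwise the step size is halved."""
    best = log_likelihood(x, params, mean, var)
    for _ in range(steps):
        grads = gradient(x, params, mean, var)
        # Widths are clamped at 0.5 channels to keep the kernels well defined.
        trial = [(w + lr * gw, mu + lr * gmu, max(0.5, s + lr * gs))
                 for (w, mu, s), (gw, gmu, gs) in zip(params, grads)]
        ll = log_likelihood(x, trial, mean, var)
        if ll > best:
            params, best = trial, ll
        else:
            lr *= 0.5
    return params, best

# Toy example: model mean for one class and an input with noise in channels 4-5.
model_mean = [1.0, 2.0, 3.0, 2.0, 1.0, 0.5, 0.5, 1.0]
model_var = [1.0] * 8
noisy_input = [1.0, 2.0, 3.0, 2.0, 2.5, 2.0, 0.5, 1.0]
initial = [(0.0, 2.0, 1.5), (0.0, 5.0, 1.5)]  # (weight, center, width) per kernel

adapted, final_ll = adapt(noisy_input, initial, model_mean, model_var)
print(log_likelihood(noisy_input, initial, model_mean, model_var), final_ll)
```

In a recognizer, one would presumably run this adaptation once per candidate model and compare the resulting likelihoods. With only two kernels, the filter cannot reshape every channel independently, which is the intuition behind limiting over-fitting.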