Noise robust speech recognition using kernel-based top-down selective attention (커널 기반 하향식 주의집중 모델을 이용한 잡음에 강인한 음성인식)

DC Field | Value | Language
dc.contributor.advisor | Lee, Soo-Young | -
dc.contributor.advisor | 이수영 | -
dc.contributor.author | Lee, Chang-Hoon | -
dc.contributor.author | 이창훈 | -
dc.date.accessioned | 2011-12-14 | -
dc.date.available | 2011-12-14 | -
dc.date.issued | 2006 | -
dc.identifier.uri | http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=258130&flag=dissertation | -
dc.identifier.uri | http://hdl.handle.net/10203/36062 | -
dc.description | Thesis (Ph.D.) - Korea Advanced Institute of Science and Technology : Electrical and Electronic Engineering, 2006.8, [ x, 83 p. ] | -
dc.description.abstract | A top-down selective attention model, drawn from psychological research, is proposed to recognize isolated words in noisy environments. The model is applied to a hidden Markov model (HMM) classifier, which is widely used for automatic speech recognition. An attention filter is introduced at the output of the Mel-filterbank, whose filter shapes resemble those of the cochlear filterbank, where human attention might operate. The attention filter is adapted by changing its gain so as to maximize the log likelihood of the attended test speech input. However, because adaptation raises the log likelihood of the attended input for whichever model is selected, any input signal can be attended to any model, and the attention process produces over-fitted attended data. A low-complexity constraint is proposed to prevent the attention filter from over-fitting. The first method uses bilinear kernels, which map the attention filter to a lower-resolution subspace and thereby reduce its complexity effectively. Experiments were run with different grid sizes and different levels of white Gaussian noise. The recognition results are improved. The false recognition rates are 41% and 54% at 20 dB SNR and 15 dB SNR, respectively. However, the attention filter with bilinear kernels is limited in its ability to model attention in some cases, since the peak values of the attention filter are tied to the grid positions. The model therefore needs a mechanism to find the proper center position and width of the receptive field. A second approach to reducing the complexity of the attention filter uses Gaussian kernels, for which not only the weights but also the center position and width of the receptive field are adapted. The attention filter with Gaussian kernels is adapted by gradient methods. The false recognition rates of this attention filter decrease by 36% and 46% at 20 dB SNR and 15 dB SNR, respectively. Although the bilinear model shows better p... | eng
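dc.language | eng | -
dc.publisher | Korea Advanced Institute of Science and Technology | -
dc.subject | speech recognition | -
dc.subject | HMM | -
dc.subject | selective attention | -
dc.subject | low resolution constraint | -
dc.subject | 확신척도 (confidence measure) | -
dc.subject | 저해상도 제한 (low resolution constraint) | -
dc.subject | 음성인식 (speech recognition) | -
dc.subject | 선택적 주의집중 (selective attention) | -
dc.subject | confidence measure | -
dc.title | Noise robust speech recognition using kernel-based top-down selective attention | -
dc.title.alternative | 커널 기반 하향식 주의집중 모델을 이용한 잡음에 강인한 음성인식 | -
dc.type | Thesis(Ph.D) | -
dc.identifier.CNRN | 258130/325007 | -
dc.description.department | Korea Advanced Institute of Science and Technology : Electrical and Electronic Engineering | -
dc.identifier.uid | 020015227 | -
dc.contributor.localauthor | Lee, Soo-Young | -
dc.contributor.localauthor | 이수영 | -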
Appears in Collection
EE-Theses_Ph.D.(박사논문)
Files in This Item
There are no files associated with this item.
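
The low-resolution constraint described in the abstract can be pictured with a small sketch. It is an illustration under stated assumptions, not the thesis implementation: the record does not give the grid layout, the feature dimensions, or the adaptation rule. The sketch assumes that the attention filter is a multiplicative gain over a time-by-frequency array of Mel-filterbank outputs and that a coarse grid of gains is expanded to full resolution by bilinear interpolation. The function names `expand_bilinear` and `apply_attention` are hypothetical.

```python
def expand_bilinear(coarse, n_frames, n_bands):
    """Expand a coarse grid of gains to an n_frames x n_bands attention filter.

    Assumption: the grid spans time (rows) and Mel bands (columns), and the
    full-resolution gains are bilinear interpolations of the grid values.
    """
    rows, cols = len(coarse), len(coarse[0])
    if min(rows, cols, n_frames, n_bands) < 2:
        raise ValueError("need at least a 2x2 grid and a 2x2 output")
    full = []
    for t in range(n_frames):
        # Fractional row position of frame t on the coarse grid.
        y = t * (rows - 1) / (n_frames - 1)
        r0 = min(int(y), rows - 2)
        wy = y - r0
        row = []
        for f in range(n_bands):
            # Fractional column position of band f on the coarse grid.
            x = f * (cols - 1) / (n_bands - 1)
            c0 = min(int(x), cols - 2)
            wx = x - c0
            # Weighted sum of the four surrounding grid gains.
            gain = ((1 - wy) * (1 - wx) * coarse[r0][c0]
                    + (1 - wy) * wx * coarse[r0][c0 + 1]
                    + wy * (1 - wx) * coarse[r0 + 1][c0]
                    + wy * wx * coarse[r0 + 1][c0 + 1])
            row.append(gain)
        full.append(row)
    return full


def apply_attention(features, gains):
    """Scale each filterbank output by its gain (assumed multiplicative)."""
    return [[g * v for g, v in zip(gain_row, feat_row)]
            for gain_row, feat_row in zip(gains, features)]
```

In this picture only the few grid values would be adapted to raise the HMM log likelihood, so the full filter has far fewer free parameters than one gain per time-frequency cell. It also shows the limitation the abstract mentions: a peak of the interpolated filter always falls on a grid position.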
