Adaptation strategies based on error back-propagation for improved speech recognition

The performance of automatic speech recognition systems needs improvement in both clean and noisy environments. An adaptation scheme based on error back-propagation is useful for achieving high recognition accuracy. These strategies can be considered from two aspects: the test phase and the training phase of speech recognition. In the test phase, we present a selective attention scheme, in particular with audio-visual integration; in the training phase, we propose a unified training scheme.

Speech is inherently bimodal, relying on cues from the acoustic and visual modalities for both perception and production. The McGurk effect demonstrates that when humans are presented with conflicting acoustic and visual stimuli, the perceived sound may not exist in either modality. This effect has formed the basis for modeling the complementary nature of acoustic and visual speech, encapsulated in the relatively new research field of audio-visual speech recognition (AVSR). In particular, for acoustically noisy speech a new algorithm is presented that integrates audio and visual information for better recognition performance; human beings indeed use visual cues such as lip movements to understand speech better in acoustically noisy environments. Many works have been reported on improving the performance of audio-visual speech recognition systems. These efforts may be categorized into two approaches: research on robust feature extraction, especially for visual signals, and research on audio-visual integration. Here we are interested in the latter. The developed algorithm utilizes top-down selective attention, an important information-processing module of human perception. A selective attention model, adopted from psychological research, is proposed to recognize isolated words in noisy environments. This model is applied to Hidden Markov Models (HMMs) as classifiers. The selective atten...
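The test-phase idea described in the abstract, adapting the input by back-propagating the classification error under a top-down attention hypothesis, can be illustrated with a minimal sketch. The snippet below is a toy under stated assumptions: it substitutes a small softmax classifier for the HMM recognizers used in the thesis, and the attend function, attention gains, learning rate, and feature dimensions are hypothetical, not the author's implementation.

```python
# Minimal sketch of test-phase selective attention via error back-propagation.
# Assumption: a simple softmax classifier stands in for the HMM classifiers of
# the thesis; all names, shapes, and hyperparameters here are illustrative.
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Hypothetical pre-trained classifier weights (20-dim features, 5 word classes).
W = rng.normal(scale=0.1, size=(5, 20))
b = np.zeros(5)

def classify(x):
    return softmax(W @ x + b)

def attend(x_noisy, target_class, steps=50, lr=0.5):
    """Adapt per-dimension attention gains so the gated input better fits the
    hypothesized class; the cross-entropy error is back-propagated to the
    input gains while the classifier weights stay fixed."""
    a = np.ones_like(x_noisy)            # attention gains, start transparent
    t = np.eye(5)[target_class]          # one-hot target hypothesis
    for _ in range(steps):
        x_att = a * x_noisy              # gated (attended) input
        y = classify(x_att)
        grad_x = W.T @ (y - t)           # dL/dx for softmax cross-entropy
        a -= lr * grad_x * x_noisy       # chain rule through x_att = a * x
        a = np.clip(a, 0.0, 2.0)         # keep gains in a plausible range
    return a, classify(a * x_noisy)

# Usage: attend to the noisy input under each word hypothesis and pick the
# hypothesis whose adapted confidence is highest.
x = rng.normal(size=20)
scores = [attend(x, c)[1][c] for c in range(5)]
print("best hypothesis:", int(np.argmax(scores)))
```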
Advisors
Lee, Soo-Young (이수영)
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2012
Identifier
486649/325007  / 020015239
Language
eng
Description

Thesis (Ph.D.) - Korea Advanced Institute of Science and Technology: Department of Electrical Engineering, 2012.2, [ix, 107 p.]

Keywords

Audio-visual integration; Selective attention model; McGurk effect; Unified training; HMM

URI
http://hdl.handle.net/10203/180226
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=486649&flag=dissertation
Appears in Collection
EE-Theses_Ph.D.(박사논문)
Files in This Item
There are no files associated with this item.
