Adaptation strategies based on error back-propagation for improved speech recognition

The performance of automatic speech recognition systems needs improvement in both clean and noisy environments. An adaptation scheme based on error back-propagation is useful for achieving high recognition accuracy. These strategies can be considered from two aspects: the test phase and the training phase of speech recognition. In the test phase, we present a selective attention scheme, in particular with audio-visual integration; in the training phase, we propose a unified training scheme.

Speech is inherently bimodal, relying on cues from the acoustic and visual modalities for both perception and production. The McGurk effect demonstrates that when humans are presented with conflicting acoustic and visual stimuli, the perceived sound may not exist in either modality. This effect has formed the basis for modeling the complementary nature of acoustic and visual speech, encapsulated in the relatively new research field of audio-visual speech recognition (AVSR). In particular, for acoustically noisy speech a new algorithm is presented that integrates audio and visual information for better recognition performance; human beings indeed use visual cues such as lip movements to understand speech better in acoustically noisy environments. Many works have been reported on improving the performance of audio-visual speech recognition systems. These efforts may be categorized into two approaches: research on robust feature extraction, especially for visual signals, and research on audio-visual integration. Here we are interested in the latter. The developed algorithm utilizes top-down selective attention, an important information-processing module of human perception. A selective attention model, adopted from psychological research, is proposed to recognize isolated words in noisy environments. This model is applied to Hidden Markov Models (HMMs) as classifiers. The selective atten...
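The test-phase idea described in the abstract, adapting the input by back-propagating the classification error under a top-down attention hypothesis, can be illustrated with a minimal sketch. The snippet below is a toy under stated assumptions: it substitutes a small softmax classifier for the HMM recognizers used in the thesis, and the attend function, attention gains, learning rate, and feature dimensions are hypothetical, not the author's implementation.

```python
# Minimal sketch of test-phase selective attention via error back-propagation.
# Assumption: a simple softmax classifier stands in for the HMM classifiers of
# the thesis; all names, shapes, and hyperparameters here are illustrative.
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Hypothetical pre-trained classifier weights (20-dim features, 5 word classes).
W = rng.normal(scale=0.1, size=(5, 20))
b = np.zeros(5)

def classify(x):
    return softmax(W @ x + b)

def attend(x_noisy, target_class, steps=50, lr=0.5):
    """Adapt per-dimension attention gains so the gated input better fits the
    hypothesized class; the cross-entropy error is back-propagated to the
    input gains while the classifier weights stay fixed."""
    a = np.ones_like(x_noisy)            # attention gains, start transparent
    t = np.eye(5)[target_class]          # one-hot target hypothesis
    for _ in range(steps):
        x_att = a * x_noisy              # gated (attended) input
        y = classify(x_att)
        grad_x = W.T @ (y - t)           # dL/dx for softmax cross-entropy
        a -= lr * grad_x * x_noisy       # chain rule through x_att = a * x
        a = np.clip(a, 0.0, 2.0)         # keep gains in a plausible range
    return a, classify(a * x_noisy)

# Usage: attend to the noisy input under each word hypothesis and pick the
# hypothesis whose adapted confidence is highest.
x = rng.normal(size=20)
scores = [attend(x, c)[1][c] for c in range(5)]
print("best hypothesis:", int(np.argmax(scores)))
```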
Advisors
Lee, Soo-Young (이수영)
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2012
Identifier
486649/325007  / 020015239
Language
eng
Description

Thesis (Ph.D.) - Korea Advanced Institute of Science and Technology: Department of Electrical Engineering, 2012.2, [ix, 107 p.]

Keywords

Audio-visual integration; Selective attention model; McGurk effect; Unified training; HMM

URI
http://hdl.handle.net/10203/180226
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=486649&flag=dissertation
Appears in Collection
EE-Theses_Ph.D.(박사논문)
Files in This Item
There are no files associated with this item.
