C2C : context to context mapping with audio-knowledge for lip reading (Korean title: context-based lip reading using audio knowledge)

Lip reading is the task of predicting the spoken sentence from silent lip movement. However, due to the existence of homophenes, words with similar lip movements but different sounds, lip reading is challenging and shows inferior performance compared to speech recognition. To mitigate the homophene problem in lip reading, in this paper we propose a novel Context to Context mapping (C2C) method that is mainly composed of two parts: 1) an Audio Context Memory Network, designed to complement insufficient visual information by storing and providing both phoneme- and context-level audio knowledge without requiring audio input during the inference phase, and 2) a Visual Feature Decomposition Module (VFDM), presented to discern subtle differences among similar lip movements by decomposing visual features into multiple latent features that capture different amounts of temporal information. The visual feature reconstructed from these latent features can distinguish subtle differences in lip movement, and its discriminativeness is also helpful for reconstructing audio knowledge at the viseme-to-phoneme level. Through extensive experiments, we validate the effectiveness of the proposed C2C method, achieving state-of-the-art performance on two public word-level lip reading datasets.
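The Audio Context Memory Network described above stores audio knowledge and recalls it from visual queries alone at inference time. A minimal NumPy sketch of this attention-style memory addressing is shown below; all names, dimensions, and the random features are illustrative assumptions, not the thesis's actual implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def audio_memory_lookup(visual_query, mem_keys, mem_values):
    """Attention-based lookup: visual features address stored
    audio knowledge, so no audio input is needed at inference."""
    scores = visual_query @ mem_keys.T       # (T, M) addressing scores
    weights = softmax(scores, axis=-1)       # soft addressing over M slots
    return weights @ mem_values              # (T, D) recalled audio features

rng = np.random.default_rng(0)
T, M, D = 5, 8, 16                        # time steps, memory slots, feature dim
visual = rng.standard_normal((T, D))      # visual features (illustrative)
keys = rng.standard_normal((M, D))        # learned visual-side keys (hypothetical)
values = rng.standard_normal((M, D))      # stored audio-side values (hypothetical)

recalled = audio_memory_lookup(visual, keys, values)
print(recalled.shape)  # (5, 16)
```

At training time the key/value slots would be learned jointly with paired audio, while at test time only the visual query path is exercised, which matches the audio-free inference the abstract claims.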
Advisors
Ro, Yong Man (노용만)
Description
Korea Advanced Institute of Science and Technology (KAIST): School of Electrical Engineering
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2022
Identifier
325007
Language
eng
Description

Thesis (Master's) - Korea Advanced Institute of Science and Technology (KAIST): School of Electrical Engineering, 2022.8, [iii, 22 p.]

Keywords

Lip Reading; Visual Speech Recognition; Context to Context Mapping; Visual Feature Decomposition; Multimodal Learning; Audio-Visual Context Mapping; Memory

URI
http://hdl.handle.net/10203/309885
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=1008353&flag=dissertation
Appears in Collection
EE-Theses_Master (Master's theses)
Files in This Item
There are no files associated with this item.
