DSpace at KOASAS: C2C : context to context mapping with audio-knowledge for lip reading

C2C : context to context mapping with audio-knowledge for lip reading (음성 지식을 활용한 문맥 정보 기반 독순술; Context-information-based lip reading using audio knowledge)

DC Field	Value	Language
dc.contributor.advisor	Ro, Yong Man	-
dc.contributor.advisor	노용만	-
dc.contributor.author	Yeo, Jeong Hun	-
dc.date.accessioned	2023-06-26T19:33:56Z	-
dc.date.available	2023-06-26T19:33:56Z	-
dc.date.issued	2022	-
dc.identifier.uri	http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=1008353&flag=dissertation	en_US
dc.identifier.uri	http://hdl.handle.net/10203/309885	-
dc.description	Thesis (Master's) - Korea Advanced Institute of Science and Technology (KAIST) : School of Electrical Engineering, 2022.8, [iii, 22 p.]	-
dc.description.abstract	Lip reading is the task of predicting a spoken sentence from silent lip movements. However, because of homophenes, which have similar lip movements but different sounds, lip reading is challenging and performs worse than audio speech recognition. To mitigate the homophene problem, we propose a novel Context to Context mapping (C2C) method composed of two main parts: 1) an Audio Context Memory Network, designed to complement insufficient visual information by storing and providing both phoneme- and context-level audio knowledge, so that no audio input is needed during the inference phase, and 2) a Visual Feature Decomposition Module (VFDM), which picks out subtle differences between similar lip movements by decomposing visual features into multiple latent features that capture different amounts of temporal information. The visual feature reconstructed from these latent features can distinguish subtle differences in lip movement, and this more discriminative feature also helps reconstruct audio knowledge at the viseme-to-phoneme level. Through extensive experiments, we validate the effectiveness of the proposed C2C method, which achieves state-of-the-art performance on two public word-level lip reading datasets.	-
dc.language	eng	-
dc.publisher	Korea Advanced Institute of Science and Technology (한국과학기술원)	-
dc.subject	Lip Reading; Visual Speech Recognition; Context to Context Mapping; Visual Feature Decomposition	-
dc.subject	Lip reading (독순술); Multimodal learning (멀티모달 러닝); Audio-visual context mapping (오디오-비주얼 문맥 정보 연결); Memory (메모리)	-
dc.title	C2C	-
dc.title.alternative	음성 지식을 활용한 문맥 정보 기반 독순술 (Context-information-based lip reading using audio knowledge)	-
dc.type	Thesis(Master)	-
dc.identifier.CNRN	325007	-
dc.description.department	Korea Advanced Institute of Science and Technology (한국과학기술원) : School of Electrical Engineering (전기및전자공학부)	-
dc.contributor.alternativeauthor	여정훈	-
dc.title.subtitle	context to context mapping with audio-knowledge for lip reading	-

Appears in Collection: EE-Theses_Master (석사논문, Master's theses)

Files in This Item: There are no files associated with this item.

KOASAS

Knowledge Service Development Team, KAIST 291 Daehak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea. T. 82-42-350-4493 Email. koasas@kaist.ac.kr
Copyright © 2016. Korea Advanced Institute of Science and Technology. All Rights Reserved.