Dense Relational Captioning: Triple-Stream Networks for Relationship-Based Captioning


DC Field	Value	Language
dc.contributor.author	Kim, Dong-Jin	ko
dc.contributor.author	Choi, Jinsoo	ko
dc.contributor.author	Oh, Tae-Hyun	ko
dc.contributor.author	Kweon, In-So	ko
dc.date.accessioned	2019-11-28T08:26:22Z	-
dc.date.available	2019-11-28T08:26:22Z	-
dc.date.created	2019-11-26	-
dc.date.issued	2019-06-19	-
dc.identifier.citation	IEEE Conference on Computer Vision and Pattern Recognition, pp.6264 - 6273	-
dc.identifier.uri	http://hdl.handle.net/10203/268690	-
dc.description.abstract	Our goal in this work is to train an image captioning model that generates denser and more informative captions. We introduce "relational captioning," a novel image captioning task that aims to generate multiple captions with respect to relational information between objects in an image. Relational captioning is a framework that is advantageous in both diversity and amount of information, leading to image understanding based on relationships. Part-of-speech (POS, i.e. subject-object-predicate categories) tags can be assigned to every English word. We leverage the POS as a prior to guide the correct sequence of words in a caption. To this end, we propose a multi-task triple-stream network (MTTSNet), which consists of three recurrent units for the respective POS and jointly performs POS prediction and captioning. We demonstrate more diverse and richer representations generated by the proposed model against several baselines and competing methods.	-
dc.language	English	-
dc.publisher	IEEE Conference on Computer Vision and Pattern Recognition	-
dc.title	Dense Relational Captioning: Triple-Stream Networks for Relationship-Based Captioning	-
dc.type	Conference	-
dc.identifier.wosid	000529484006047	-
dc.identifier.scopusid	2-s2.0-85066479484	-
dc.type.rims	CONF	-
dc.citation.beginningpage	6264	-
dc.citation.endingpage	6273	-
dc.citation.publicationname	IEEE Conference on Computer Vision and Pattern Recognition	-
dc.identifier.conferencecountry	US	-
dc.identifier.conferencelocation	Long Beach, CA	-
dc.identifier.doi	10.1109/CVPR.2019.00643	-
dc.contributor.localauthor	Kweon, In-So	-
dc.contributor.nonIdAuthor	Kim, Dong-Jin	-
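
The abstract describes MTTSNet only at a high level: three recurrent units, one per POS category, trained jointly for captioning and POS prediction. The following is a minimal sketch of that idea under loudly stated assumptions, not the authors' implementation. The plain tanh cell, the shared per-step input, the concatenation of the three hidden states, the layer sizes, and every name used (`RecurrentUnit`, `TripleStreamSketch`, `joint_loss`, `pos_weight`) are hypothetical choices made for illustration.

```python
import math
import random

random.seed(0)

POS_CLASSES = ["subject", "predicate", "object"]


def rand_matrix(rows, cols):
    # Small random weights; a real model would learn these.
    return [[random.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(logits):
    top = max(logits)
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


class RecurrentUnit:
    """Plain tanh recurrent cell (assumption: stands in for the paper's unit)."""

    def __init__(self, input_size, hidden_size):
        self.hidden_size = hidden_size
        self.w_in = rand_matrix(hidden_size, input_size)
        self.w_rec = rand_matrix(hidden_size, hidden_size)

    def step(self, x, h):
        a = matvec(self.w_in, x)
        b = matvec(self.w_rec, h)
        return [math.tanh(p + q) for p, q in zip(a, b)]


class TripleStreamSketch:
    """Three recurrent streams, one per POS class, feeding two output heads."""

    def __init__(self, input_size, hidden_size, vocab_size):
        self.streams = [RecurrentUnit(input_size, hidden_size) for _ in POS_CLASSES]
        joint_size = len(POS_CLASSES) * hidden_size
        self.word_head = rand_matrix(vocab_size, joint_size)       # captioning head
        self.pos_head = rand_matrix(len(POS_CLASSES), joint_size)  # POS head

    def forward(self, inputs):
        # Assumption: every stream sees the same input vector at each step.
        states = [[0.0] * s.hidden_size for s in self.streams]
        outputs = []
        for x in inputs:
            states = [s.step(x, h) for s, h in zip(self.streams, states)]
            joint = [v for h in states for v in h]  # concatenate the three states
            word_probs = softmax(matvec(self.word_head, joint))
            pos_probs = softmax(matvec(self.pos_head, joint))
            outputs.append((word_probs, pos_probs))
        return outputs


def joint_loss(outputs, word_targets, pos_targets, pos_weight=1.0):
    # Multi-task objective: word cross-entropy plus weighted POS cross-entropy,
    # averaged over time steps.
    total = 0.0
    for (word_p, pos_p), w, p in zip(outputs, word_targets, pos_targets):
        total += -math.log(word_p[w]) - pos_weight * math.log(pos_p[p])
    return total / len(outputs)


model = TripleStreamSketch(input_size=4, hidden_size=5, vocab_size=7)
steps = [[0.1, 0.2, 0.3, 0.4], [0.0, 1.0, 0.0, 1.0], [0.5, 0.5, 0.5, 0.5]]
outputs = model.forward(steps)
loss = joint_loss(outputs, word_targets=[2, 5, 1], pos_targets=[0, 1, 2])
```

Here each time step yields both a word distribution and a POS distribution, and the single scalar `loss` couples the two tasks, which is one plausible reading of "jointly performs POS prediction and captioning."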

Appears in Collection: EE-Conference Papers

Files in This Item: There are no files associated with this item.

KOASAS

Knowledge Service Development Team, KAIST 291 Daehak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea. T. 82-42-350-4493 Email. koasas@kaist.ac.kr
Copyright © 2016. Korea Advanced Institute of Science and Technology. All Rights Reserved.
