DSpace at KOASAS: LEARNING SOUND LOCALIZATION BETTER FROM SEMANTICALLY SIMILAR SAMPLES

DC Field	Value	Language
dc.contributor.author	Senocak, Arda	ko
dc.contributor.author	Ryu, Hyeonggon	ko
dc.contributor.author	Kim, Junsik	ko
dc.contributor.author	Kweon, In-So	ko
dc.date.accessioned	2022-11-17T07:00:24Z	-
dc.date.available	2022-11-17T07:00:24Z	-
dc.date.created	2022-09-27	-
dc.date.issued	2022-05	-
dc.identifier.citation	47th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2022, pp.4863 - 4867	-
dc.identifier.issn	1520-6149	-
dc.identifier.uri	http://hdl.handle.net/10203/299804	-
dc.description.abstract	The objective of this work is to localize sound sources in visual scenes. Existing audio-visual works employ contrastive learning by treating corresponding audio-visual pairs from the same source as positives and randomly mismatched pairs as negatives. However, these negative pairs may contain semantically matched audio-visual information, so semantically correlated pairs, “hard positives”, are mistakenly grouped as negatives. Our key contribution is showing that hard positives can give response maps similar to those of the corresponding pairs. Our approach incorporates these hard positives by adding their response maps directly into the contrastive learning objective. We demonstrate the effectiveness of our approach on the VGG-SS and SoundNet-Flickr test sets, showing favorable performance compared with state-of-the-art methods.	-
dc.language	English	-
dc.publisher	Institute of Electrical and Electronics Engineers Inc.	-
dc.title	LEARNING SOUND LOCALIZATION BETTER FROM SEMANTICALLY SIMILAR SAMPLES	-
dc.type	Conference	-
dc.identifier.wosid	000864187905031	-
dc.identifier.scopusid	2-s2.0-85127066533	-
dc.type.rims	CONF	-
dc.citation.beginningpage	4863	-
dc.citation.endingpage	4867	-
dc.citation.publicationname	47th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2022	-
dc.identifier.conferencecountry	US	-
dc.identifier.conferencelocation	Virtual, Online	-
dc.identifier.doi	10.1109/ICASSP43922.2022.9747867	-
dc.contributor.localauthor	Kweon, In-So	-
dc.contributor.nonIdAuthor	Ryu, Hyeonggon	-
dc.contributor.nonIdAuthor	Kim, Junsik	-

Appears in Collection: EE-Conference Papers(학술회의논문)

Files in This Item: There are no files associated with this item.


