DSpace at KOASAS: Scalable Neural Video Representations with Learnable Positional Features


Scalable Neural Video Representations with Learnable Positional Features


DC Field	Value	Language
dc.contributor.author	Kim, Subin	ko
dc.contributor.author	Yu, Sihyun	ko
dc.contributor.author	Lee, Jaeho	ko
dc.contributor.author	Shin, Jinwoo	ko
dc.date.accessioned	2023-09-13T07:00:47Z	-
dc.date.available	2023-09-13T07:00:47Z	-
dc.date.created	2023-09-13	-
dc.date.issued	2022-11	-
dc.identifier.citation	36th Conference on Neural Information Processing Systems, NeurIPS 2022	-
dc.identifier.uri	http://hdl.handle.net/10203/312583	-
dc.description.abstract	Succinct representation of complex signals using coordinate-based neural representations (CNRs) has seen great progress, and several recent efforts focus on extending them to handle videos. Here, the main challenge is how to (a) alleviate the compute inefficiency of training CNRs, (b) achieve high-quality video encoding, and (c) maintain parameter efficiency. To meet all three requirements simultaneously, we propose neural video representations with learnable positional features (NVP), a novel CNR that introduces “learnable positional features” to effectively amortize a video as latent codes. Specifically, we first present a CNR architecture built on 2D latent keyframes that learn the common video contents across each spatio-temporal axis, which dramatically improves all three requirements. Then, we propose to use existing powerful image and video codecs as a compute- and memory-efficient compression procedure for the latent codes. We demonstrate the superiority of NVP on the popular UVG benchmark: compared with prior art, NVP not only trains 2 times faster (in less than 5 minutes) but also improves encoding quality from 34.07 to 34.57 (measured with the PSNR metric), even while using more than 8 times fewer parameters. We also show intriguing properties of NVP, such as video inpainting and video frame interpolation.	-
dc.language	English	-
dc.publisher	Neural Information Processing Systems Foundation	-
dc.title	Scalable Neural Video Representations with Learnable Positional Features	-
dc.type	Conference	-
dc.identifier.scopusid	2-s2.0-85162807450	-
dc.type.rims	CONF	-
dc.citation.publicationname	36th Conference on Neural Information Processing Systems, NeurIPS 2022	-
dc.identifier.conferencecountry	US	-
dc.identifier.conferencelocation	New Orleans	-
dc.contributor.localauthor	Shin, Jinwoo	-
dc.contributor.nonIdAuthor	Kim, Subin	-
dc.contributor.nonIdAuthor	Lee, Jaeho	-

Appears in Collection: AI-Conference Papers(학술대회논문)

Files in This Item: There are no files associated with this item.
