DC Field | Value | Language |
---|---|---|
dc.contributor.author | Kim, Subin | ko |
dc.contributor.author | Yu, Sihyun | ko |
dc.contributor.author | Lee, Jaeho | ko |
dc.contributor.author | Shin, Jinwoo | ko |
dc.date.accessioned | 2023-09-13T07:00:47Z | - |
dc.date.available | 2023-09-13T07:00:47Z | - |
dc.date.created | 2023-09-13 | - |
dc.date.issued | 2022-11 | - |
dc.identifier.citation | 36th Conference on Neural Information Processing Systems, NeurIPS 2022 | - |
dc.identifier.uri | http://hdl.handle.net/10203/312583 | - |
dc.description.abstract | Succinct representation of complex signals using coordinate-based neural representations (CNRs) has seen great progress, and several recent efforts focus on extending them to handle videos. Here, the main challenge is how to (a) alleviate the compute-inefficiency of training CNRs to (b) achieve high-quality video encoding while (c) maintaining parameter-efficiency. To meet requirements (a), (b), and (c) simultaneously, we propose neural video representations with learnable positional features (NVP), a novel CNR that introduces “learnable positional features” which effectively amortize a video as latent codes. Specifically, we first present a CNR architecture based on 2D latent keyframes that learn the common video contents across each spatio-temporal axis, which dramatically improves on all three requirements. Then, we propose to utilize existing powerful image and video codecs as a compute- and memory-efficient compression procedure for the latent codes. We demonstrate the superiority of NVP on the popular UVG benchmark: compared with prior art, NVP not only trains 2 times faster (in less than 5 minutes) but also improves encoding quality from 34.07 to 34.57 (measured with the PSNR metric), while using >8 times fewer parameters. We also show intriguing properties of NVP, e.g., video inpainting and video frame interpolation. | - |
dc.language | English | - |
dc.publisher | Neural Information Processing Systems Foundation | - |
dc.title | Scalable Neural Video Representations with Learnable Positional Features | - |
dc.type | Conference | - |
dc.identifier.scopusid | 2-s2.0-85162807450 | - |
dc.type.rims | CONF | - |
dc.citation.publicationname | 36th Conference on Neural Information Processing Systems, NeurIPS 2022 | - |
dc.identifier.conferencecountry | US | - |
dc.identifier.conferencelocation | New Orleans | - |
dc.contributor.localauthor | Shin, Jinwoo | - |
dc.contributor.nonIdAuthor | Kim, Subin | - |
dc.contributor.nonIdAuthor | Lee, Jaeho | - |