DC Field | Value | Language |
---|---|---|
dc.contributor.author | Kim, Yeongbin | ko |
dc.contributor.author | Singh, Gautam | ko |
dc.contributor.author | Park, Junyeong | ko |
dc.contributor.author | Gulcehre, Caglar | ko |
dc.contributor.author | Ahn, Sungjin | ko |
dc.date.accessioned | 2023-11-30T01:02:29Z | - |
dc.date.available | 2023-11-30T01:02:29Z | - |
dc.date.created | 2023-11-09 | - |
dc.date.issued | 2023-12-14 | - |
dc.identifier.citation | The Thirty-seventh Conference on Neural Information Processing Systems, NeurIPS 2023 | - |
dc.identifier.uri | http://hdl.handle.net/10203/315450 | - |
dc.description.abstract | Systematic compositionality, or the ability to adapt to novel situations by creating a mental model of the world using reusable pieces of knowledge, remains a significant challenge in machine learning. While there has been considerable progress in the language domain, efforts towards systematic visual imagination, or envisioning the dynamical implications of a visual observation, are in their infancy. We introduce the Systematic Visual Imagination Benchmark (SVIB), the first benchmark designed to address this problem head-on. SVIB offers a novel framework for a minimal world modeling problem, where models are evaluated based on their ability to generate one-step image-to-image transformations under a latent world dynamics. The framework provides benefits such as the possibility to jointly optimize for systematic perception and imagination, a range of difficulty levels, and the ability to control the fraction of possible factor combinations used during training. We provide a comprehensive evaluation of various baseline models on SVIB, offering insight into the current state-of-the-art in systematic visual imagination. We hope that this benchmark will help advance visual systematic compositionality. | - |
dc.language | English | - |
dc.publisher | The Conference on Neural Information Processing Systems | - |
dc.title | Imagine the Unseen World: A Benchmark for Systematic Generalization in Visual World Models | - |
dc.type | Conference | - |
dc.type.rims | CONF | - |
dc.citation.publicationname | The Thirty-seventh Conference on Neural Information Processing Systems, NeurIPS 2023 | - |
dc.identifier.conferencecountry | US | - |
dc.identifier.conferencelocation | New Orleans Ernest N. Morial Convention Center | - |
dc.contributor.localauthor | Ahn, Sungjin | - |
dc.contributor.nonIdAuthor | Kim, Yeongbin | - |
dc.contributor.nonIdAuthor | Singh, Gautam | - |
dc.contributor.nonIdAuthor | Park, Junyeong | - |
dc.contributor.nonIdAuthor | Gulcehre, Caglar | - |