Imagine the Unseen World: A Benchmark for Systematic Generalization in Visual World Models

DC Field | Value | Language
dc.contributor.author | Kim, Yeongbin | ko
dc.contributor.author | Singh, Gautam | ko
dc.contributor.author | Park, Junyeong | ko
dc.contributor.author | Gulcehre, Caglar | ko
dc.contributor.author | Ahn, Sungjin | ko
dc.date.accessioned | 2023-11-30T01:02:29Z | -
dc.date.available | 2023-11-30T01:02:29Z | -
dc.date.created | 2023-11-09 | -
dc.date.issued | 2023-12-14 | -
dc.identifier.citation | The Thirty-seventh Conference on Neural Information Processing Systems, NeurIPS 2023 | -
dc.identifier.uri | http://hdl.handle.net/10203/315450 | -
dc.description.abstract | Systematic compositionality, or the ability to adapt to novel situations by creating a mental model of the world using reusable pieces of knowledge, remains a significant challenge in machine learning. While there has been considerable progress in the language domain, efforts towards systematic visual imagination, or envisioning the dynamical implications of a visual observation, are in their infancy. We introduce the Systematic Visual Imagination Benchmark (SVIB), the first benchmark designed to address this problem head-on. SVIB offers a novel framework for a minimal world modeling problem, where models are evaluated based on their ability to generate one-step image-to-image transformations under a latent world dynamics. The framework provides benefits such as the possibility to jointly optimize for systematic perception and imagination, a range of difficulty levels, and the ability to control the fraction of possible factor combinations used during training. We provide a comprehensive evaluation of various baseline models on SVIB, offering insight into the current state-of-the-art in systematic visual imagination. We hope that this benchmark will help advance visual systematic compositionality. | -
dc.language | English | -
dc.publisher | The Conference on Neural Information Processing Systems | -
dc.title | Imagine the Unseen World: A Benchmark for Systematic Generalization in Visual World Models | -
dc.type | Conference | -
dc.type.rims | CONF | -
dc.citation.publicationname | The Thirty-seventh Conference on Neural Information Processing Systems, NeurIPS 2023 | -
dc.identifier.conferencecountry | US | -
dc.identifier.conferencelocation | New Orleans Ernest N. Morial Convention Center | -
dc.contributor.localauthor | Ahn, Sungjin | -
dc.contributor.nonIdAuthor | Kim, Yeongbin | -
dc.contributor.nonIdAuthor | Singh, Gautam | -
dc.contributor.nonIdAuthor | Park, Junyeong | -
dc.contributor.nonIdAuthor | Gulcehre, Caglar | -
Appears in Collection
CS-Conference Papers (학술회의논문, Conference Papers)
Files in This Item
There are no files associated with this item.
