DC Field | Value | Language |
---|---|---|
dc.contributor.author | Kim, Dahun | ko |
dc.contributor.author | Woo, Sanghyun | ko |
dc.contributor.author | Lee, Joon-Young | ko |
dc.contributor.author | Kweon, In So | ko |
dc.date.accessioned | 2022-09-06T03:00:51Z | - |
dc.date.available | 2022-09-06T03:00:51Z | - |
dc.date.created | 2022-09-06 | - |
dc.date.issued | 2022 | - |
dc.identifier.citation | IEEE TRANSACTIONS ON IMAGE PROCESSING, v.31, pp.5383 - 5395 | - |
dc.identifier.issn | 1057-7149 | - |
dc.identifier.uri | http://hdl.handle.net/10203/298370 | - |
dc.description.abstract | A holistic understanding of dynamic scenes is of fundamental importance in real-world computer vision problems such as autonomous driving, augmented reality, and spatio-temporal reasoning. In this paper, we propose a new computer vision benchmark: Video Panoptic Segmentation (VPS). To study this problem, we present two datasets, Cityscapes-VPS and VIPER, together with a new evaluation metric, video panoptic quality (VPQ). We also propose VPSNet++, an advanced video panoptic segmentation network that simultaneously performs classification, detection, segmentation, and tracking of all identities in videos. Specifically, VPSNet++ builds upon a top-down panoptic segmentation network by adding a pixel-level feature fusion head and an object-level association head. The former temporally augments the pixel features, while the latter performs object tracking. Furthermore, we propose panoptic boundary learning as an auxiliary task, along with instance discrimination learning, which learns spatio-temporally clustered pixel embeddings for individual thing or stuff regions, i.e., exactly the objective of the video panoptic segmentation problem. VPSNet++ significantly outperforms the default VPSNet, i.e., the FuseTrack baseline, and achieves state-of-the-art results on both the Cityscapes-VPS and VIPER datasets. The datasets, metric, and models are publicly available at https://github.com/mcahny/vps. | - |
dc.language | English | - |
dc.publisher | IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC | - |
dc.title | Dense Pixel-Level Interpretation of Dynamic Scenes With Video Panoptic Segmentation | - |
dc.type | Article | - |
dc.identifier.wosid | 000842776300009 | - |
dc.identifier.scopusid | 2-s2.0-85133760600 | - |
dc.type.rims | ART | - |
dc.citation.volume | 31 | - |
dc.citation.beginningpage | 5383 | - |
dc.citation.endingpage | 5395 | - |
dc.citation.publicationname | IEEE TRANSACTIONS ON IMAGE PROCESSING | - |
dc.identifier.doi | 10.1109/TIP.2022.3183440 | - |
dc.contributor.localauthor | Kweon, In So | - |
dc.contributor.nonIdAuthor | Lee, Joon-Young | - |
dc.description.isOpenAccess | N | - |
dc.type.journalArticle | Article | - |
dc.subject.keywordAuthor | Task analysis | - |
dc.subject.keywordAuthor | Image segmentation | - |
dc.subject.keywordAuthor | Measurement | - |
dc.subject.keywordAuthor | Electron tubes | - |
dc.subject.keywordAuthor | Semantics | - |
dc.subject.keywordAuthor | Head | - |
dc.subject.keywordAuthor | Benchmark testing | - |
dc.subject.keywordAuthor | Video panoptic segmentation | - |
dc.subject.keywordAuthor | panoptic segmentation | - |
dc.subject.keywordAuthor | video instance segmentation | - |
dc.subject.keywordAuthor | video semantic segmentation | - |
dc.subject.keywordAuthor | scene parsing | - |