Attention-based video masking for improving open set action recognition
오픈셋 행동 인식 향상을 위한 어텐션 기반 비디오 마스킹

DC Field | Value | Language
dc.contributor.advisor | 최호진 | -
dc.contributor.author | Sim, Minho | -
dc.contributor.author | 심민호 | -
dc.date.accessioned | 2024-07-25T19:31:24Z | -
dc.date.available | 2024-07-25T19:31:24Z | -
dc.date.issued | 2023 | -
dc.identifier.uri | http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=1045957&flag=dissertation | en_US
dc.identifier.uri | http://hdl.handle.net/10203/320725 | -
dc.description | Thesis (Master's) - KAIST (한국과학기술원): School of Computing, 2023.8, [iv, 35 p.] | -
dc.description.abstract | In real-world scenarios, human action recognition (HAR) is essentially an open set problem that requires a model to classify actions from known classes and detect actions from unknown classes simultaneously. However, HAR models are easily biased toward static information in the video (e.g., background), which can lead to performance degradation of open set action recognition (OSAR) models. In this paper, we propose a simple framework for improving OSAR based on the video attention map extracted from the video vision transformer model. Specifically, our framework eliminates patches with static bias in video using two debiasing steps: (1) frame selection and (2) patch masking. Experimental results show that our framework achieves consistent performance improvement on multiple OSAR methods and challenging benchmarks. Furthermore, we introduce two new OSAR tasks, Kinetics-400 vs. Kinetics-600 exclusive and Kinetics-400 vs. Kinetics-700 exclusive, to validate our method in a setting close to the real-world scenario. With extensive experiments, we demonstrate the effectiveness of our attention-based masking, and in-depth analysis validates the effect of static bias on OSAR. | -
dc.language | eng | -
dc.publisher | KAIST (한국과학기술원) | -
dc.subject | 오픈셋 행동 인식; 비디오 마스킹; 어텐션 맵; 비디오 비전 트랜스포머 | -
dc.subject | Open set action recognition; video masking; attention map; video vision transformer | -
dc.title | Attention-based video masking for improving open set action recognition | -
dc.title.alternative | 오픈셋 행동 인식 향상을 위한 어텐션 기반 비디오 마스킹 | -
dc.type | Thesis(Master) | -
dc.identifier.CNRN | 325007 | -
dc.description.department | KAIST (한국과학기술원): School of Computing | -
dc.contributor.alternativeauthor | Choi, Ho-Jin | -
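
The two debiasing steps named in the abstract, frame selection and patch masking, can be illustrated with a minimal sketch. This is not the thesis's implementation: the abstract does not state the selection or masking rules, so the sketch assumes that frames are ranked by mean attention and that the lowest-attention patches in each kept frame are the ones masked. The function names (`select_frames`, `patch_keep_mask`, `debias`) and the parameters `num_frames` and `mask_ratio` are hypothetical, and the attention map is assumed to be a nested list indexed as `attn[frame][row][col]`.

```python
def select_frames(attn, num_frames):
    """Return indices of the frames with the highest mean attention, in temporal order.

    Assumption: mean attention is the ranking score (not stated in the abstract).
    """
    def mean_attention(frame):
        values = [a for row in frame for a in row]
        return sum(values) / len(values)

    ranked = sorted(range(len(attn)), key=lambda t: mean_attention(attn[t]), reverse=True)
    return sorted(ranked[:num_frames])


def patch_keep_mask(frame_attn, mask_ratio):
    """Return a 2D boolean grid; False marks a patch to be masked out.

    Assumption: the `mask_ratio` fraction of patches with the lowest attention is masked.
    """
    rows, cols = len(frame_attn), len(frame_attn[0])
    # Flat patch indices ordered from lowest to highest attention.
    order = sorted(range(rows * cols), key=lambda i: frame_attn[i // cols][i % cols])
    masked = set(order[:round(mask_ratio * rows * cols)])
    return [[(r * cols + c) not in masked for c in range(cols)] for r in range(rows)]


def debias(attn, num_frames, mask_ratio):
    """Apply frame selection, then patch masking, and return {frame index: keep mask}."""
    return {t: patch_keep_mask(attn[t], mask_ratio) for t in select_frames(attn, num_frames)}


if __name__ == "__main__":
    # Toy attention map: 4 frames, each a 2x2 grid of patch scores.
    toy_attn = [
        [[0.0, 0.0], [0.0, 0.0]],
        [[0.9, 0.1], [0.8, 0.2]],
        [[0.0, 0.0], [0.0, 0.0]],
        [[0.5, 0.6], [0.1, 0.7]],
    ]
    for frame_index, keep in debias(toy_attn, num_frames=2, mask_ratio=0.5).items():
        print(frame_index, keep)
```

The keep masks returned here would then be applied to the corresponding video patches before the clip is passed to an OSAR model; how masked patches are filled (zeros, mean color, token dropping) is left open because the abstract does not specify it.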
Appears in Collection
CS-Theses_Master(석사논문)
Files in This Item
There are no files associated with this item.