DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 최호진 | - |
dc.contributor.author | Lee, Jong-Whoa | - |
dc.contributor.author | 이종화 | - |
dc.date.accessioned | 2024-07-25T19:31:23Z | - |
dc.date.available | 2024-07-25T19:31:23Z | - |
dc.date.issued | 2023 | - |
dc.identifier.uri | http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=1045950&flag=dissertation | en_US |
dc.identifier.uri | http://hdl.handle.net/10203/320718 | - |
dc.description | Master's thesis - Korea Advanced Institute of Science and Technology (KAIST) : School of Computing, 2023.8, [iii, 29 p.] | - |
dc.description.abstract | Human action recognition (HAR) aims to understand human behavior and predict the correct label for each action from visual input such as RGB video, infrared video, depth video, or skeleton data. The same action may be performed with different movements depending on the performer, or interpreted as a different action in a specific domain. This variability makes it difficult to collect enough data to train action recognition models. We therefore consider an efficient method that can be trained with few samples and whose learned features can be applied to other domains, for example through knowledge distillation. In this paper, we propose a teacher-student network that learns action representations from skeleton sequences and from textual information describing each action. The teacher network consists of two encoders: a skeleton encoder, which is a graph-based model that fits the structure of skeletons, and a text encoder pre-trained on large-scale datasets. The teacher uses the skeleton sequences together with additional textual information, namely synonyms of the action labels, to provide cross-modal knowledge to the student network. The student network contains only a skeleton encoder, the same as the teacher's, and learns semantic relationships under the guidance of the teacher's knowledge. Experiments on one-shot HAR using the public NTU RGB+D 120 dataset demonstrate state-of-the-art performance of the proposed method. | - |
dc.language | eng | - |
dc.publisher | Korea Advanced Institute of Science and Technology (KAIST) | - |
dc.subject | action recognition; skeleton-based action recognition; skeleton information; one-shot learning; cross-modal knowledge distillation; teacher-student networks | - |
dc.subject | human action recognition; skeleton-based human action recognition; skeleton information; one-shot learning; cross-modal knowledge distillation; teacher-student networks | - |
dc.title | Cross-modal knowledge distillation for one-shot human action recognition | - |
dc.title.alternative | A cross-modal knowledge distillation method for one-shot action recognition | - |
dc.type | Thesis(Master) | - |
dc.identifier.CNRN | 325007 | - |
dc.description.department | Korea Advanced Institute of Science and Technology (KAIST) : School of Computing | - |
dc.contributor.alternativeauthor | Choi, Ho-Jin | - |
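
The teacher-student setup described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the thesis's implementation: the record does not state the fusion rule, the loss, or the one-shot decision rule, so this sketch fuses the teacher's skeleton and text embeddings by averaging, uses cosine distance as the distillation loss, and classifies a query by its nearest one-shot exemplar. All function names are hypothetical, and plain lists stand in for encoder outputs.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        return list(v)
    return [x / norm for x in v]


def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))


def teacher_embedding(skeleton_emb, text_emb):
    """Assumed fusion: average the normalized skeleton and text embeddings."""
    s, t = l2_normalize(skeleton_emb), l2_normalize(text_emb)
    return l2_normalize([(x + y) / 2 for x, y in zip(s, t)])


def distillation_loss(student_emb, teacher_emb):
    """Assumed loss: cosine distance pulling the student toward the teacher."""
    return 1.0 - cosine(student_emb, teacher_emb)


def one_shot_predict(query_emb, exemplars):
    """Assumed rule: return the label of the most similar single exemplar.

    exemplars maps each novel action label to the embedding of its one sample.
    """
    return max(exemplars, key=lambda label: cosine(query_emb, exemplars[label]))


# Toy 2-D embeddings standing in for encoder outputs.
teacher = teacher_embedding([1.0, 0.0], [0.0, 1.0])
loss = distillation_loss([1.0, 1.0], teacher)
label = one_shot_predict([0.9, 0.1], {"wave": [1.0, 0.0], "kick": [0.0, 1.0]})
```

In this toy setting only the student's skeleton embedding is needed at prediction time, which mirrors the abstract's point that the student contains only a skeleton encoder.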