Disjoint multi-task learning between heterogeneous action and caption data

It is widely believed that the success of deep neural networks on various image tasks is owed to the large amount of annotated data. For video-related tasks, although various datasets exist, the number of annotated videos in any single dataset is still far smaller than in image datasets. In this thesis, we leverage existing video datasets with heterogeneous videos and annotations, so that a model can be trained while compensating for the limited size of any single dataset. Because the video data in each dataset carry heterogeneous annotations, traditional multi-task learning cannot be applied directly in this scenario. To this end, we propose a simple alternating directional optimization method that learns efficiently from the heterogeneous data. We demonstrate the effectiveness of our model on both the action recognition and caption-embedding tasks. With our method, we show performance improvements on the action recognition task, and performance on the sentence retrieval task comparable to a model trained on single-task data.
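The alternating scheme described in the abstract — two datasets with disjoint annotations sharing one backbone, where each update touches the shared weights plus only the head whose dataset supplied the batch — can be sketched as follows. This is a heavily simplified, hypothetical NumPy illustration, not the thesis implementation: the backbone is a single linear layer, the action loss is softmax cross-entropy, the caption-embedding loss is a plain L2 regression to target vectors, and all names and dimensions are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (hypothetical, for illustration only).
D_IN, D_FEAT, N_ACTIONS, D_EMBED = 8, 4, 3, 5
W_shared = rng.normal(scale=0.1, size=(D_IN, D_FEAT))      # shared video backbone
W_action = rng.normal(scale=0.1, size=(D_FEAT, N_ACTIONS)) # action-recognition head
W_caption = rng.normal(scale=0.1, size=(D_FEAT, D_EMBED))  # caption-embedding head

# Two disjoint toy datasets with heterogeneous annotations:
# one has action labels, the other has caption-embedding targets.
x_act = rng.normal(size=(32, D_IN)); y_act = rng.integers(0, N_ACTIONS, 32)
x_cap = rng.normal(size=(32, D_IN)); y_cap = rng.normal(size=(32, D_EMBED))

def action_step(lr=0.1):
    """One gradient step on the action loss (softmax cross-entropy)."""
    global W_shared, W_action
    f = x_act @ W_shared
    logits = f @ W_action
    p = np.exp(logits - logits.max(1, keepdims=True))
    p /= p.sum(1, keepdims=True)
    loss = -np.log(p[np.arange(len(y_act)), y_act] + 1e-12).mean()
    d = p                                  # gradient of mean CE w.r.t. logits
    d[np.arange(len(y_act)), y_act] -= 1.0
    d /= len(y_act)
    g_shared = x_act.T @ (d @ W_action.T)  # backprop through the shared layer
    g_head = f.T @ d
    W_action -= lr * g_head                # update ONLY this task's head ...
    W_shared -= lr * g_shared              # ... plus the shared backbone
    return loss

def caption_step(lr=0.1):
    """One gradient step on a toy caption-embedding loss (mean squared error)."""
    global W_shared, W_caption
    f = x_cap @ W_shared
    diff = f @ W_caption - y_cap
    loss = (diff ** 2).mean()
    d = 2.0 * diff / diff.size
    g_shared = x_cap.T @ (d @ W_caption.T)
    g_head = f.T @ d
    W_caption -= lr * g_head
    W_shared -= lr * g_shared
    return loss

# Alternating optimization: batches from the two heterogeneous datasets
# take turns; no sample ever needs both kinds of annotation.
losses = {"action": [], "caption": []}
for step in range(400):
    if step % 2 == 0:
        losses["action"].append(action_step())
    else:
        losses["caption"].append(caption_step())
```

The key property the sketch shows is that a standard multi-task loss (a weighted sum over both tasks per sample) is never formed; each dataset only ever drives its own head and the shared backbone, which is what makes training possible when annotations are disjoint.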
Advisors
Kweon, In So (권인소)
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2017
Identifier
325007
Language
eng
Description

Master's thesis - Korea Advanced Institute of Science and Technology: School of Electrical Engineering, 2017.2, [v, 48 p.]

Keywords

Deep learning; Action recognition; Visual semantic embedding; Multi-task learning; Machine learning

URI
http://hdl.handle.net/10203/243250
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=675355&flag=dissertation
Appears in Collection
EE-Theses_Master (Master's theses)
Files in This Item
There are no files associated with this item.
