DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | Yoo, Chang Dong | - |
dc.contributor.advisor | 유창동 | - |
dc.contributor.author | Kang, Sunghun | - |
dc.contributor.author | 강성훈 | - |
dc.date.accessioned | 2017-03-29T02:38:24Z | - |
dc.date.available | 2017-03-29T02:38:24Z | - |
dc.date.issued | 2016 | - |
dc.identifier.uri | http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=649575&flag=dissertation | en_US |
dc.identifier.uri | http://hdl.handle.net/10203/221765 | - |
dc.description | Thesis (Master's) - Korea Advanced Institute of Science and Technology (KAIST) : School of Electrical Engineering, 2016.2, [iv, 25 p.] | - |
dc.description.abstract | Over the last few decades, many papers about video categorization have been published. Despite the rich information in videos, previous algorithms for video categorization mainly rely on fusing multiple visual features, including static and motion information. In other words, the previous models do not utilize audio information. In this paper, we propose a framework for video categorization that utilizes both visual and auditory information from given videos, and we investigate different types of deep features. The framework consists of a feature extractor for each modality and a fusion step that generates the audiovisual feature. For the visual feature, we fine-tuned AlexNet to obtain more discriminative features and measured the performance. Two methods are used and evaluated for capturing audio information from videos: a 1D-CNN and a bag-of-words representation. The highest mean average precision scores are achieved by audiovisual features that consist of fine-tuned AlexNet features and a bag-of-words representation of MFCCs. From the results, we show that audiovisual features help to categorize videos without any degradation of performance. | - |
dc.language | eng | - |
dc.publisher | Korea Advanced Institute of Science and Technology (KAIST) | - |
dc.subject | Video Categorization | - |
dc.subject | Deep Learning | - |
dc.subject | Audiovisual | - |
dc.subject | Multi-modal | - |
dc.subject | Convolutional Neural Network | - |
dc.subject | 영상분류 | - |
dc.subject | 심화학습 | - |
dc.subject | 시청각 | - |
dc.subject | 멀티모달 | - |
dc.subject | 컨볼루션 신경망 | - |
dc.title | (A) study on audiovisual deep features for video categorization | - |
dc.title.alternative | 영상 분류를 위한 시청각적 심층 특징에 관한 연구 | - |
dc.type | Thesis(Master) | - |
dc.identifier.CNRN | 325007 | - |
dc.description.department | Korea Advanced Institute of Science and Technology (KAIST) : School of Electrical Engineering | - |