Image captioning with 2-layer LSTM network for combining visual attributes
Korean title: 사진 정보 결합을 위한 2단 LSTM 구조를 이용한 사진 설명문 생성 알고리즘

Image captioning is a rising field of study in Artificial Intelligence, since the task involves both computer vision and natural language processing. We propose a novel 2-layer Long Short-Term Memory (LSTM) network architecture for generating a caption that describes an image. Our model consists of two LSTMs that play different roles in generating sentences: one combines visual attributes extracted by a convolutional neural network, and the other decodes the result. We also train an image feature extractor to extract multiple objects. Our model is validated on the Microsoft COCO dataset, and the results show that it outperforms other state-of-the-art models on the evaluation metrics BLEU, METEOR, and CIDEr.
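The two-LSTM arrangement described in the abstract can be illustrated with a minimal sketch. This record does not give the thesis's equations, so everything below is an assumption made for illustration: the class names `LSTMCell` and `TwoLayerCaptioner`, the layer sizes, feeding the attribute vector together with the previous word embedding into the first LSTM, greedy decoding, and the start/end token ids. The weights are random and untrained, so the output is a list of arbitrary token ids, not a meaningful caption.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class LSTMCell:
    """Standard LSTM cell: input, forget, and output gates plus a candidate."""

    def __init__(self, input_size, hidden_size, rng):
        n_in = input_size + hidden_size
        # One weight matrix and one bias vector per gate (i, f, o) and candidate (g).
        self.W = {k: [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
                      for _ in range(hidden_size)] for k in "ifog"}
        self.b = {k: [0.0] * hidden_size for k in "ifog"}

    def _affine(self, k, z):
        return [sum(w * v for w, v in zip(row, z)) + b
                for row, b in zip(self.W[k], self.b[k])]

    def step(self, x, h, c):
        z = x + h  # concatenate the input with the previous hidden state
        i = [sigmoid(v) for v in self._affine("i", z)]
        f = [sigmoid(v) for v in self._affine("f", z)]
        o = [sigmoid(v) for v in self._affine("o", z)]
        g = [math.tanh(v) for v in self._affine("g", z)]
        c_new = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h_new = [oj * math.tanh(cj) for oj, cj in zip(o, c_new)]
        return h_new, c_new


class TwoLayerCaptioner:
    """Hypothetical layout: LSTM 1 combines attributes, LSTM 2 decodes words."""

    def __init__(self, attr_dim, vocab_size, embed_dim=8, hidden=16, seed=0):
        rng = random.Random(seed)
        self.hidden = hidden
        self.embed = [[rng.uniform(-0.1, 0.1) for _ in range(embed_dim)]
                      for _ in range(vocab_size)]
        # Assumption: the first LSTM sees the attributes and the previous word.
        self.combiner = LSTMCell(attr_dim + embed_dim, hidden, rng)
        # Assumption: the second LSTM sees only the first LSTM's hidden state.
        self.decoder = LSTMCell(hidden, hidden, rng)
        self.out = [[rng.uniform(-0.1, 0.1) for _ in range(hidden)]
                    for _ in range(vocab_size)]

    def generate(self, attributes, max_len=10, start=0, end=1):
        h1, c1 = [0.0] * self.hidden, [0.0] * self.hidden
        h2, c2 = [0.0] * self.hidden, [0.0] * self.hidden
        token, caption = start, []
        for _ in range(max_len):
            x = list(attributes) + self.embed[token]
            h1, c1 = self.combiner.step(x, h1, c1)
            h2, c2 = self.decoder.step(h1, h2, c2)
            logits = [sum(w * v for w, v in zip(row, h2)) for row in self.out]
            token = max(range(len(logits)), key=logits.__getitem__)  # greedy pick
            if token == end:
                break
            caption.append(token)
        return caption


if __name__ == "__main__":
    model = TwoLayerCaptioner(attr_dim=5, vocab_size=12)
    # A made-up attribute vector standing in for CNN outputs.
    print(model.generate([0.1, 0.9, 0.0, 0.3, 0.5], max_len=6))
```

In a real system the attribute vector would come from the trained image feature extractor and the weights would be learned on captioned images; the sketch only shows how data could flow through two stacked LSTMs with separate roles.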
Advisors
Kim, Dae-Shik (김대식)
Description
KAIST (Korea Advanced Institute of Science and Technology): School of Electrical Engineering
Publisher
KAIST (Korea Advanced Institute of Science and Technology)
Issue Date
2017
Identifier
325007
Language
eng
Description

Thesis (Master's) - KAIST: School of Electrical Engineering, 2017.2, [iii, 31 p.]

Keywords

Image captioning; Long short-term memory; convolutional neural network; deep learning; artificial intelligence; 사진 설명문생성; 롱 쇼트 텀 메모리; 콘볼루셔널 뉴럴 네트워크; 딥러닝; 인공지능

URI
http://hdl.handle.net/10203/243269
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=675376&flag=dissertation
Appears in Collection
EE-Theses_Master(석사논문)
Files in This Item
There are no files associated with this item.
