Deep semantic visual embeddings with spatial relationships (Korean title: Semantic visual embeddings that consider spatial positional relationships)

Understanding the relationships between objects in an image is an important problem in computer vision. Methods that account for these relationships have recently been proposed for many vision tasks, but few studies address them in the semantic-visual embedding problem. In this paper, we first propose a new dataset called R-CLEVR that concentrates on the relations between objects in semantic-visual problems, and we then introduce an Object Phase Module (OPM) that focuses on the relative locations of objects in an image. Experiments demonstrate that our proposed network with the Object Phase Module achieves the highest performance in cross-modal retrieval and phrase grounding on the R-CLEVR dataset. Furthermore, our model demonstrates meaningful performance on the MS-COCO dataset, which contains relatively few object relations.
Advisors
Kim, Dae-Shik (김대식)
Description
Korea Advanced Institute of Science and Technology (KAIST): School of Electrical Engineering
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2019
Identifier
325007
Language
eng
Description

Thesis (Master's) - Korea Advanced Institute of Science and Technology (KAIST): School of Electrical Engineering, 2019.2, [v, 26 p.]

Keywords

Deep learning; computer vision; multi-modal; image and text understanding; semantic visual embeddings

URI
http://hdl.handle.net/10203/266712
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=843408&flag=dissertation
Appears in Collection
EE-Theses_Master (Master's theses)
