DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | Yoo, Chang D. | - |
dc.contributor.advisor | 유창동 | - |
dc.contributor.author | Kim, Minsu | - |
dc.date.accessioned | 2019-09-04T02:43:00Z | - |
dc.date.available | 2019-09-04T02:43:00Z | - |
dc.date.issued | 2018 | - |
dc.identifier.uri | http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=733990&flag=dissertation | en_US |
dc.identifier.uri | http://hdl.handle.net/10203/266852 | - |
dc.description | Thesis (Master's) - Korea Advanced Institute of Science and Technology (KAIST) : School of Electrical Engineering, 2018.2, [iii, 28 p.] | - |
dc.description.abstract | This thesis presents a network, referred to as p-LocalNet, for pixel-accurate localization of the object described by an input text-phrase. Given an image and a text-phrase describing an object of interest, the network localizes the region of that object with pixel accuracy. To achieve this, p-LocalNet associates visual representations with linguistic representations according to spatial extent. The input text-phrase is fed into a long short-term memory (LSTM) network, which generates local and global weights to be associated with the spatially local and global visual representations of the input image, respectively. These visual representations are extracted from multi-level feature maps of a convolutional neural network (CNN). To associate each visual representation with its corresponding weights, two streams of feature-wise linear modulation (FiLM) are employed. To evaluate p-LocalNet, a small subset of the MSCOCO dataset containing only baseball-related images is collected and manually labeled. We refer to this dataset as the Baseball Game Dataset (BG-Dataset). The images are manually selected, and each image is described in detail and labeled with a binary map highlighting the object. The experimental results demonstrate that BG-Dataset is well suited to text-phrase-based object localization and that p-LocalNet is capable of localizing the object with high pixel accuracy. | - |
dc.language | eng | - |
dc.publisher | Korea Advanced Institute of Science and Technology (KAIST) | - |
dc.subject | Deep learning; pixel-wise localization; object selection; multi-modal; segmentation | - |
dc.subject | Deep learning; pixel-accuracy localization; object selection; multi-modal; region segmentation | - |
dc.title | Where is your player | - |
dc.title.alternative | Text-phrase-based deep pixel-wise visual localization on baseball game data | - |
dc.type | Thesis(Master) | - |
dc.identifier.CNRN | 325007 | - |
dc.description.department | Korea Advanced Institute of Science and Technology (KAIST) : School of Electrical Engineering | - |
dc.contributor.alternativeauthor | 김민수 | - |
dc.title.subtitle | Deep pixel-wise visual localization on baseball game data via text-phrase | - |
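
The abstract mentions that two streams of feature-wise linear modulation (FiLM) associate the LSTM-generated weights with the local and global visual representations. As a rough illustration only, the sketch below shows the per-channel scale-and-shift operation that FiLM is generally understood to perform. The function name `film`, the tensor layout, and all numeric values are assumptions made for this example; the abstract does not specify how p-LocalNet computes the weights or combines the two streams, so this is not the thesis implementation.

```python
from typing import List

# Assumed layout for this sketch: channels x height x width, as nested lists.
FeatureMap = List[List[List[float]]]


def film(features: FeatureMap, gamma: List[float], beta: List[float]) -> FeatureMap:
    """Apply a per-channel scale (gamma) and shift (beta) to a feature map."""
    if not (len(features) == len(gamma) == len(beta)):
        raise ValueError("need exactly one gamma and one beta per channel")
    return [
        [[g * value + b for value in row] for row in channel]
        for channel, g, b in zip(features, gamma, beta)
    ]


# Hypothetical inputs: two channels of a 1x2 feature map for each stream.
local_features = [[[1.0, 2.0]], [[3.0, 4.0]]]
global_features = [[[0.0, 1.0]], [[2.0, 2.0]]]

# In the thesis these weights would come from the LSTM; here they are made up.
local_out = film(local_features, gamma=[2.0, 0.5], beta=[1.0, -1.0])
global_out = film(global_features, gamma=[1.0, 3.0], beta=[0.0, 1.0])

print(local_out)
print(global_out)
```

The two calls stand in for the local and global streams; they are left uncombined here because the abstract does not say how the streams are fused.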