Where is your player : Deep pixel-wise visual localization on baseball game data via text-phrase = 야구 게임 데이터에 대한 텍스트 문구 기반 심층 픽셀 단위 시각적 위치 추정

DC Field | Value | Language
dc.contributor.advisor | Yoo, Chang D. | -
dc.contributor.advisor | 유창동 | -
dc.contributor.author | Kim, Minsu | -
dc.date.accessioned | 2019-09-04T02:43:00Z | -
dc.date.available | 2019-09-04T02:43:00Z | -
dc.date.issued | 2018 | -
dc.identifier.uri | http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=733990&flag=dissertation | en_US
dc.identifier.uri | http://hdl.handle.net/10203/266852 | -
dc.description | Thesis (Master's) - Korea Advanced Institute of Science and Technology (KAIST) : School of Electrical Engineering, 2018.2, [iii, 28 p. :] | -
dc.description.abstract | This thesis presents a network, referred to as p-LocalNet, for pixel-accurate localization of the object referred to by an input text-phrase. Given an image and a text-phrase describing an object of interest, the network localizes the region of that object with pixel accuracy. To achieve this, p-LocalNet associates visual representations with linguistic representations according to spatial area. The input text-phrase is fed into a long short-term memory (LSTM) network to generate local and global weights, which are associated with the spatially local and global visual representations of the input image, respectively. These local and global visual representations are extracted from multi-level feature maps of a convolutional neural network (CNN). To associate each visual representation with its weights, two streams of feature-wise linear modulation (FiLM) are employed. To evaluate p-LocalNet, a small baseball-only subset of the MSCOCO dataset is collected and manually labeled; it is referred to as the Baseball Game Dataset (BG-Dataset). The images are manually selected, and each image is described in detail and labeled with a binary map highlighting the object. The experimental results demonstrate that BG-Dataset is well suited to text-phrase-based object localization and that p-LocalNet is capable of localizing the object with high pixel accuracy. | -
dc.language | eng | -
dc.publisher | Korea Advanced Institute of Science and Technology (한국과학기술원) | -
dc.subject | Deep learning; pixel-wise localization; object selection; multi-modal; segmentation | -
dc.subject | 심층 학습 (deep learning); 픽셀 정확도 위치 추정 (pixel-accuracy localization); 객체 선택 (object selection); 다중 모달 (multi-modal); 영역 분할 (segmentation) | -
dc.title | Where is your player : Deep pixel-wise visual localization on baseball game data via text-phrase = 야구 게임 데이터에 대한 텍스트 문구 기반 심층 픽셀 단위 시각적 위치 추정 | -
dc.title.alternative | Deep pixel-wise visual localization on baseball game data via text-phrase | -
dc.type | Thesis(Master) | -
dc.identifier.CNRN | 325007 | -
dc.description.department | Korea Advanced Institute of Science and Technology (KAIST) : School of Electrical Engineering | -
dc.contributor.alternativeauthor | 김민수 | -
Appears in Collection
EE-Theses_Master(석사논문)
Files in This Item
There are no files associated with this item.
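
The abstract states that feature-wise linear modulation (FiLM) is used to associate each visual representation with the weights generated from the text-phrase. As a reading aid only, the following is a minimal sketch of the FiLM operation itself, applied once to a "local" and once to a "global" feature map. It is not the thesis implementation: the function name `film`, the toy feature values, and the hand-picked `gamma`/`beta` values are all assumptions for illustration. In the thesis the features come from a CNN and the weights from an LSTM, and the record does not say how the two streams are combined afterwards.

```python
from typing import List

# A feature map indexed as [channel][row][col].
FeatureMap = List[List[List[float]]]


def film(features: FeatureMap, gamma: List[float], beta: List[float]) -> FeatureMap:
    """Feature-wise linear modulation: scale and shift each channel.

    out[c][i][j] = gamma[c] * features[c][i][j] + beta[c]
    """
    if not (len(features) == len(gamma) == len(beta)):
        raise ValueError("need one (gamma, beta) pair per channel")
    return [
        [[g * v + b for v in row] for row in channel]
        for channel, g, b in zip(features, gamma, beta)
    ]


# Toy stand-ins (assumptions): two channels on a 2x2 grid per stream.
local_feats = [[[1.0, 2.0], [3.0, 4.0]], [[0.5, 0.5], [0.5, 0.5]]]
global_feats = [[[1.0, 1.0], [1.0, 1.0]], [[2.0, 2.0], [2.0, 2.0]]]

# Each stream gets its own hypothetical text-derived (gamma, beta) pair.
local_out = film(local_feats, gamma=[2.0, 0.0], beta=[0.0, 1.0])
global_out = film(global_feats, gamma=[1.0, -1.0], beta=[0.5, 0.0])
```

Because `gamma` and `beta` are per-channel, the text-phrase can amplify, suppress, or invert individual feature channels without touching the spatial layout, which is what lets a later stage produce a pixel-wise map.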
