Multimodal representation: neural language model based on Kneser-Ney smoothing / skip-gram

This thesis considers a multimodal representation that associates image features with text, such that the conditional probability of the next word given the past n words and the image features is defined by a neural language model, and applies it to image retrieval and text generation. In contrast to previous representations, ours is learned to address data sparsity, which has degraded the evaluation of neural language models. Specifically, Kneser-Ney smoothing and skip-gram techniques are each integrated into a multimodal neural language model, e.g., the modality-biased log-bilinear model. As a result, next-word prediction based on the conditional probability yields better contextual consistency within one unit of each modality, i.e., one sentence or one image, while the correspondence between image and text is also enhanced. The representation is validated on the IAPR TC-12 and Attribute Discovery datasets for image retrieval and text generation, showing improved perplexity and BLEU-n scores and effective shared representation learning.
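For context, the following is a minimal sketch of the two ingredients the abstract refers to, assuming the standard modality-biased log-bilinear (MLBL-B) formulation and interpolated Kneser-Ney smoothing; the exact way the thesis combines them (and how the skip-gram objective enters training) is not reproduced here.

% Sketch under assumptions: MLBL-B next-word distribution with an image-feature bias,
% where r_w are word embeddings, x is the image feature, and C_i, C_m are learned matrices.
\[
\hat{r} \;=\; \sum_{i=1}^{n} C_i\, r_{w_{t-i}} \;+\; C_m\, x,
\qquad
P_{\mathrm{NLM}}(w_t = w \mid w_{t-n}, \ldots, w_{t-1}, x)
\;=\;
\frac{\exp\!\bigl(\hat{r}^{\top} r_w + b_w\bigr)}
     {\sum_{w'} \exp\!\bigl(\hat{r}^{\top} r_{w'} + b_{w'}\bigr)}
\]
% Sketch under assumptions: interpolated Kneser-Ney smoothing over the same context,
% with counts c(.), absolute discount d, normalizer \lambda, and continuation probability P_cont.
\[
P_{\mathrm{KN}}(w \mid w_{t-n}, \ldots, w_{t-1})
\;=\;
\frac{\max\bigl(c(w_{t-n}, \ldots, w_{t-1}, w) - d,\, 0\bigr)}{c(w_{t-n}, \ldots, w_{t-1})}
\;+\;
\lambda(w_{t-n}, \ldots, w_{t-1})\; P_{\mathrm{cont}}(w)
\]

In this reading, the neural term supplies the shared image-text representation while the Kneser-Ney term counters n-gram data sparsity; the interpolation weights and the role of the skip-gram technique are specified in the thesis itself.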
Advisors
Yoo, Chang Dong (유창동)
Description
Korea Advanced Institute of Science and Technology: School of Electrical Engineering
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2016
Identifier
325007
Language
eng
Description

Master's thesis - Korea Advanced Institute of Science and Technology: School of Electrical Engineering, 2016.2, [iv, 24 p.]

Keywords

Multimodal Representation; Neural Language Model; Kneser-Ney; Skip-gram; Image Retrieval; Image Query; Text Generation

URI
http://hdl.handle.net/10203/221766
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=649621&flag=dissertation
Appears in Collection
EE-Theses_Master (Master's theses)
Files in This Item
There are no files associated with this item.
