Ladder-deep convolutional recurrent writer for generating images = 이미지 생성을 위한 딥 래더 컨볼루셔널 순환형 작성자 네트워크

In this thesis, we propose a deep recursive autoencoder-based architecture with enhanced interaction between the encoder and decoder networks to improve image generation performance. In the first part of the thesis, we modify the deep recurrent attentive writer (DRAW) architecture by replacing the RNN in the encoder with a CNN, because RNNs are difficult to use for feature learning in several spatio-temporal domains, and even in images. This is mainly because an RNN must remember far back in time to find pixels that are horizontally or vertically aligned. In addition, CNNs are commonly used for image processing tasks, where they give state-of-the-art performance. In the second part of the thesis, the model is further modified to increase its expressiveness and, in turn, its performance. To this end, multiple stochastic layers are introduced into the architecture, which help the model generate complex data. Moreover, the interaction between the inference and generation networks is increased by adding skip connections between the recognizer and generator networks, which makes data generation more effective. Three variants of the Ladder deep convolutional recurrent writer (L-DCRW) are proposed, each with increased interaction between the recognizer network and the generator network. The first architecture trains the network to obtain the posterior by combining the mean and variance of the recognizer network (which act as Gaussian likelihoods) with the mean and variance of the generator network (which can be considered priors). In the second architecture, skip connections between the inference network and the generation network are introduced at the higher layers, so that the higher layers no longer need to capture all the information and only need to learn abstract representations. Finally, an architecture with skip connections at all layers is presented.
Furthermore, in the last chapter of this thesis, the same ladder network idea is also applied to, and tested with, the DRAW architecture. All architectures are tested on the MNIST and Omniglot datasets, and the results are analyzed.
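The abstract does not state how the first L-DCRW variant combines the recognizer and generator statistics. One common choice, used in ladder variational autoencoders, is a precision-weighted product of two diagonal Gaussians. The sketch below assumes that rule; the function names (`combine_gaussians`, `sample_posterior`) and the toy values are hypothetical and are not taken from the thesis.

```python
import numpy as np


def combine_gaussians(mu_q, var_q, mu_p, var_p):
    """Precision-weighted product of two diagonal Gaussians.

    mu_q, var_q: mean and variance from the recognizer (treated as a likelihood).
    mu_p, var_p: mean and variance from the generator (treated as a prior).
    This combination rule is an assumption, not a detail given in the abstract.
    """
    prec_q = 1.0 / var_q          # precision is the inverse variance
    prec_p = 1.0 / var_p
    var = 1.0 / (prec_q + prec_p)  # precisions add for a product of Gaussians
    mu = var * (mu_q * prec_q + mu_p * prec_p)
    return mu, var


def sample_posterior(mu, var, rng):
    """Reparameterized sample: z = mu + sigma * eps with eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.sqrt(var) * eps


# Toy example with a 2-dimensional latent variable (values are illustrative).
mu, var = combine_gaussians(
    mu_q=np.array([0.0, 2.0]), var_q=np.array([1.0, 1.0]),
    mu_p=np.array([2.0, 2.0]), var_p=np.array([1.0, 3.0]),
)
z = sample_posterior(mu, var, np.random.default_rng(0))
```

In the toy example, the first latent dimension has equal variances, so the combined mean lies halfway between the two inputs and the combined variance is halved. In the second dimension both means agree, so only the variance shrinks.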
Advisors
Kim, Jong Hwan (김종환)
Description
Korea Advanced Institute of Science and Technology (KAIST): School of Electrical Engineering
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2017
Identifier
325007
Language
eng
Description

Thesis (Master's) - Korea Advanced Institute of Science and Technology (KAIST): School of Electrical Engineering, 2017.8, [iv, 47 p.]

Keywords

Variational Autoencoders; Convolutional Neural Networks; DRAW; ladder variational autoencoder; Recurrent Neural Network and visual attention

URI
http://hdl.handle.net/10203/243382
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=718693&flag=dissertation
Appears in Collection
EE-Theses_Master (Master's theses)
Files in This Item
There are no files associated with this item.
