Learning-based image synthesis with disentangled representations

A disentangled representation separates the explanatory generative factors of data within the representation, offering desirable properties such as interpretability and controllability. Recent methods for unsupervised disentanglement learning show promise on simple data but often yield unsatisfactory results on complex real-world data. This issue can be alleviated by incorporating human prior knowledge or additional learning objectives into the disentangling process, which is explored in this dissertation. We propose two disentanglement learning methods, with (1) shape supervision and (2) category supervision, and employ them for image synthesis.

For virtual clothing try-on (VTO) applications, the first method synthesizes clothing segments by disentangling their underlying factors (i.e., shape and style). An encoder separates style features from shape features, which are defined as the foreground masks of segments. A generator combines these features to produce clothing segments, which are then superimposed on person images for try-on. Moreover, we propose an evaluation metric to assess how well the generator synthesizes styles. Unlike recent VTO works based on full-image synthesis, our disentangling strategy enables segment-level synthesis and yields several benefits, including accurate style expression and easy data collection. Experiments on fashion-parsing datasets and a VTO benchmark show that our method generates high-quality clothing segments and outperforms existing synthesis methods. Additionally, we compare our method with neural style transfer and visualize the differing concepts of style.

For controllable image synthesis, the second method separates the generative factors of images (i.e., content and style) into two latent vectors in a variational autoencoder. Under class supervision with partially available labels, one vector captures content factors relevant to the classification, while the other captures style factors related to the remaining variation. This separation is strengthened by a learning objective, called vector independence, that encourages statistical independence between the two vectors. We reveal that (i) this independence term arises in decomposing the evidence lower bound with two latent vectors, and (ii) penalizing it along with the total correlation leads to good disentanglement learning. Experiments on the MNIST and Fashion-MNIST datasets demonstrate the effectiveness of our method for improving image classification and synthesis. Furthermore, experiments on the dSprites dataset quantitatively show the relation between vector independence and disentanglement. We believe this research contributes to advancing the learning of disentangled representations and improving the controllability of machine learning methods.
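The vector-independence idea above can be illustrated with a minimal sketch. Assuming, purely for illustration, that the aggregate posterior over the two latent vectors is jointly Gaussian, their mutual information has a closed form in covariance determinants. The helper `gaussian_vector_independence` below is a hypothetical stand-in for such a penalty, not the estimator used in the dissertation.

```python
import numpy as np

def gaussian_vector_independence(z_c, z_s):
    """Illustrative proxy for the vector-independence term I(z_c; z_s).

    Under a joint-Gaussian assumption on the aggregate posterior,
    I = 0.5 * (log det(S_cc) + log det(S_ss) - log det(S)),
    where S is the joint sample covariance of [z_c, z_s].
    By Fischer's inequality this estimate is always non-negative,
    and it is zero exactly when the two blocks are uncorrelated.
    """
    z = np.concatenate([z_c, z_s], axis=1)   # (n, d_c + d_s)
    d_c = z_c.shape[1]
    S = np.cov(z, rowvar=False)              # joint covariance
    _, logdet_joint = np.linalg.slogdet(S)
    _, logdet_c = np.linalg.slogdet(S[:d_c, :d_c])
    _, logdet_s = np.linalg.slogdet(S[d_c:, d_c:])
    return 0.5 * (logdet_c + logdet_s - logdet_joint)

rng = np.random.default_rng(0)
# Independent content/style samples: penalty should be near zero.
independent = gaussian_vector_independence(
    rng.normal(size=(5000, 4)), rng.normal(size=(5000, 4)))
# Nearly duplicated vectors: penalty should be large.
shared = rng.normal(size=(5000, 4))
dependent = gaussian_vector_independence(
    shared, shared + 0.1 * rng.normal(size=(5000, 4)))
```

In a training loop, a penalty of this kind would be added to the reconstruction and KL terms of the two-vector ELBO, so that gradients push the content and style posteriors toward statistical independence.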
Advisors
Dae-Shik Kim; Soo-Young Lee
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2020
Identifier
325007
Language
eng
Description

Thesis (Ph.D.) - KAIST, School of Electrical Engineering, 2020.2, [vi, 97 p.]

Keywords

disentanglement learning; disentangled representations; image synthesis; neural network; variational autoencoder; semi-supervised learning; vector independence; virtual try-on

URI
http://hdl.handle.net/10203/284277
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=915168&flag=dissertation
Appears in Collection
EE-Theses_Ph.D. (Doctoral theses)
Files in This Item
There are no files associated with this item.
