DiffusionCLIP Text-Guided Diffusion Models for Robust Image Manipulation

Cited 91 time in webofscience Cited 0 time in scopus
  • Hit : 117
  • Download : 0
Recently, GAN inversion methods combined with Contrastive Language-Image Pretraining (CLIP) enables zeroshot image manipulation guided by text prompts. However, their applications to diverse real images are still difficult due to the limited GAN inversion capability. Specifically, these approaches often have difficulties in reconstructing images with novel poses, views, and highly variable contents compared to the training data, altering object identity, or producing unwanted image artifacts. To mitigate these problems and enable faithful manipulation of real images, we propose a novel method, dubbed DiffusionCLIP, that performs textdriven image manipulation using diffusion models. Based on full inversion capability and high-quality image generation power of recent diffusion models, our method performs zeroshot image manipulation successfully even between unseen domains and takes another step towards general application by manipulating images from a widely varying ImageNet dataset. Furthermore, we propose a novel noise combination method that allows straightforward multi-attribute manipulation. Extensive experiments and human evaluation confirmed robust and superior manipulation performance of our methods compared to the existing baselines. Code is available at https://github.com/gwang-kim/DiffusionCLIP.git
Publisher
IEEE Computer Society
Issue Date
2022-06
Language
English
Citation

2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022, pp.2416 - 2425

ISSN
1063-6919
DOI
10.1109/CVPR52688.2022.00246
URI
http://hdl.handle.net/10203/312702
Appears in Collection
AI-Conference Papers(학술대회논문)
Files in This Item
There are no files associated with this item.
This item is cited by other documents in WoS
⊙ Detail Information in WoSⓡ Click to see webofscience_button
⊙ Cited 91 items in WoS Click to see citing articles in records_button

qr_code

  • mendeley

    citeulike


rss_1.0 rss_2.0 atom_1.0