Learning to disentangle latent physical factors of deformable faces

We proposed a monocular image disentanglement framework based on a compositional model. The model decomposes an input image into its constituent components: albedo, depth, deformation, pose, and illumination. Instead of relying on handcrafted priors, we trained a deep neural network to learn the physical meaning of each component by mimicking real-world operations, which allows it to reconstruct images in a self-supervised manner. Trained on multi-frame images of each subject, the model gains a better understanding of the objects without requiring any supervision or strong model assumptions. We used a deformation-free canonical space to align the multi-frame images, so that information from all frames of a subject can be interpreted in a common space. Our experiments showed that the approach accurately disentangles the physical elements of deformable faces from in-the-wild images with wide variations.
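To make the compositional idea concrete, the following is a minimal sketch of how an image might be recomposed from some of the factors and compared with the input. It is not the paper's implementation: it assumes simple Lambertian shading, leaves out the deformation and pose factors entirely, and uses hypothetical names throughout (`normals_from_depth`, `compose`, `l1_loss`, and the `ambient` and `diffuse` weights).

```python
import math

def normals_from_depth(depth):
    """Estimate unit surface normals from a depth map by finite differences."""
    h, w = len(depth), len(depth[0])
    normals = []
    for y in range(h):
        row = []
        for x in range(w):
            # Central differences inside the map, one-sided at the borders.
            x0, x1 = max(x - 1, 0), min(x + 1, w - 1)
            y0, y1 = max(y - 1, 0), min(y + 1, h - 1)
            dzdx = (depth[y][x1] - depth[y][x0]) / max(x1 - x0, 1)
            dzdy = (depth[y1][x] - depth[y0][x]) / max(y1 - y0, 1)
            n = (-dzdx, -dzdy, 1.0)
            norm = math.sqrt(sum(c * c for c in n))
            row.append(tuple(c / norm for c in n))
        normals.append(row)
    return normals

def compose(albedo, depth, light_dir, ambient=0.5, diffuse=0.5):
    """Recompose a grayscale image as albedo times Lambertian shading.

    Assumption: shading = ambient + diffuse * max(0, n . l). Deformation
    and pose are omitted in this sketch.
    """
    norm = math.sqrt(sum(c * c for c in light_dir))
    light = tuple(c / norm for c in light_dir)
    image = []
    for a_row, n_row in zip(albedo, normals_from_depth(depth)):
        out_row = []
        for a, n in zip(a_row, n_row):
            cos_term = max(0.0, sum(nc * lc for nc, lc in zip(n, light)))
            out_row.append(a * (ambient + diffuse * cos_term))
        image.append(out_row)
    return image

def l1_loss(pred, target):
    """Mean absolute difference, a common self-supervised reconstruction loss."""
    total = sum(abs(p - t) for pr, tr in zip(pred, target) for p, t in zip(pr, tr))
    return total / (len(pred) * len(pred[0]))

# Toy factors: a uniform albedo, a flat depth map, and a frontal light.
albedo = [[0.8] * 3 for _ in range(3)]
flat_depth = [[1.0] * 3 for _ in range(3)]
recon = compose(albedo, flat_depth, light_dir=(0.0, 0.0, 1.0))
print(l1_loss(recon, albedo))
```

In this toy case the flat surface faces the light directly, so the shading term equals one and the recomposed image matches the albedo. In a learned setting, a network would predict the factors from the input image, and a reconstruction loss of this kind would be the training signal, which is one way to read "self-supervised" in the abstract.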
Publisher
SPRINGER
Issue Date
2023-08
Language
English
Article Type
Article
Citation

VISUAL COMPUTER, v.39, no.8, pp.3481 - 3494

ISSN
0178-2789
DOI
10.1007/s00371-023-02948-1
URI
http://hdl.handle.net/10203/311830
Appears in Collection
CS-Journal Papers