NeRFFaceSpeech: one-shot audio-driven 3D talking head synthesis via generative prior
(Korean title: Audio-driven talking 3D face generation from a single image using generative prior knowledge)

Audio-driven talking head generation is advancing from 2D to 3D content. Recent methods based on Neural Radiance Fields (NeRF) can synthesize 3D output, but they require extensive paired audio-visual data for each identity, which limits their scalability. Other studies have shown that convincing audio-driven talking heads can be generated from even a single image. Despite their promise, these techniques struggle to produce accurate 3D-aware results because a single image carries insufficient information about occluded regions. In this paper, we propose NeRFFaceSpeech, a novel pipeline that addresses the trade-off between the number of input images and the fidelity of 3D information. Using the prior knowledge of generative models combined with NeRF, our method constructs a 3D-consistent facial feature space corresponding to a single image. Our approach then employs ray deformation to map the audio-correlated vertex dynamics of a parametric face model onto the facial feature space, ensuring realistic 3D facial motion. Moreover, to fill in the missing information in the inner-mouth area, which cannot be obtained from a single image, we introduce LipaintNet, a novel network trained in a self-supervised manner. Lastly, comprehensive experiments demonstrate that our pipeline produces audio-driven talking heads from a single image with better 3D consistency than previous approaches.
Advisors
Junyong Noh (노준용)
Description
Korea Advanced Institute of Science and Technology (KAIST): Graduate School of Culture Technology
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2024
Identifier
325007
Language
eng
Description

Thesis (Master's) - Korea Advanced Institute of Science and Technology (KAIST): Graduate School of Culture Technology, 2024.2, [iv, 31 p.]

Keywords

Audio-driven talking head generation; Neural radiance field (NeRF); 3D-aware imaging; Self-supervised learning; Generative prior
Korean keywords (translated): Audio-driven talking face generation; 3D animation; Self-supervised learning; Neural radiance field; Generative prior

URI
http://hdl.handle.net/10203/321390
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=1096175&flag=dissertation
Appears in Collection
GCT-Theses_Master(석사논문)
Files in This Item
There are no files associated with this item.
