While audio-driven talking head generation has achieved highly realistic multi-speaker results, previous works rely on predefined auxiliary data such as 3D model parameters, facial landmarks, and head pose angles. Such explicit supervision is expensive: scanning 3D models requires special devices in a controlled lab environment, and landmarks require manual annotation. In this paper, we propose, for the first time, a multi-speaker talking video generation framework that uses no predefined priors. We first design a novel style code manipulator that explores the latent space of a pretrained StyleGAN3 and generates a sequence of style codes within the distribution of the generator. In this way, we achieve identity-preserving head pose matching without any predefined supervision. Furthermore, by leveraging the power of StyleGAN3, our framework achieves high-quality video generation. Finally, we adopt a sync loss, computed from an expert discriminator that maps audio and visual features into a unified space, for better lip synchronization. Our framework is fully unsupervised, as it includes no model trained with additional annotated data. Experimental results show that our method generates high-quality videos and performs competitively with state-of-the-art methods that rely on supervision.
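To make the style code manipulator concrete, the following is a minimal sketch of one plausible realization: the source portrait is inverted into the W+ space of the pretrained StyleGAN3, and a temporal network predicts small per-frame offsets from audio features, so each frame is decoded as G(w_id + Δw_t). The GRU backbone, layer sizes, offset scaling, and all module names here are our own assumptions for illustration, not the paper's actual architecture.

```python
import torch
import torch.nn as nn

class StyleCodeManipulator(nn.Module):
    """Hypothetical sketch: predict per-frame W+ offsets from audio so that
    generated codes stay near the inverted identity code w_id, preserving
    identity while letting head pose and mouth shape vary over time."""

    def __init__(self, audio_dim=768, w_dim=512, num_ws=16, hidden=512):
        super().__init__()
        self.num_ws, self.w_dim = num_ws, w_dim
        self.temporal = nn.GRU(audio_dim, hidden, batch_first=True)
        self.to_offset = nn.Linear(hidden, num_ws * w_dim)

    def forward(self, w_id, audio_feats):
        # w_id: (B, num_ws, w_dim) identity code from StyleGAN3 inversion
        # audio_feats: (B, T, audio_dim) per-frame audio features
        h, _ = self.temporal(audio_feats)            # (B, T, hidden)
        delta = self.to_offset(h)                    # (B, T, num_ws * w_dim)
        delta = delta.view(*delta.shape[:2], self.num_ws, self.w_dim)
        # Bounded, small offsets (assumed scale 0.1) keep the codes within
        # the generator's learned W distribution, as the abstract requires.
        return w_id.unsqueeze(1) + 0.1 * torch.tanh(delta)  # (B, T, num_ws, w_dim)
```

The returned code sequence would then be fed frame by frame to the frozen StyleGAN3 synthesis network; keeping the generator fixed is what lets the framework inherit its image quality without extra supervision.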
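The sync loss described above follows the expert-discriminator pattern popularized by SyncNet and Wav2Lip: two encoders embed an audio window and the corresponding lip frames into a shared space, and the loss rewards high cosine similarity for in-sync pairs. Below is a minimal sketch under that assumption; the encoder architectures, input shapes, and the mapping of similarity to a probability are placeholders, not the paper's specification.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SyncLoss(nn.Module):
    """SyncNet-style sync loss: BCE on the cosine similarity between audio
    and visual embeddings produced by a (frozen) expert discriminator."""

    def __init__(self, audio_encoder: nn.Module, visual_encoder: nn.Module):
        super().__init__()
        # Assumed to be the two halves of a pretrained expert discriminator
        # that maps both modalities into one unified feature space.
        self.audio_encoder = audio_encoder
        self.visual_encoder = visual_encoder

    def forward(self, mel: torch.Tensor, frames: torch.Tensor) -> torch.Tensor:
        a = F.normalize(self.audio_encoder(mel), dim=-1)     # (B, D)
        v = F.normalize(self.visual_encoder(frames), dim=-1)  # (B, D)
        sim = (a * v).sum(dim=-1).clamp(-1.0, 1.0)           # cosine in [-1, 1]
        p = (sim + 1.0) / 2.0                                # map to (0, 1)
        # Penalize deviation from the "in sync" label of 1.
        return F.binary_cross_entropy(p, torch.ones_like(p))

# Minimal usage with stand-in linear encoders (hypothetical shapes):
aud_enc = nn.Sequential(nn.Flatten(), nn.Linear(80 * 16, 512))
vis_enc = nn.Sequential(nn.Flatten(), nn.Linear(3 * 5 * 48 * 96, 512))
loss = SyncLoss(aud_enc, vis_enc)(torch.randn(4, 80, 16),
                                  torch.randn(4, 3, 5, 48, 96))
```

Because only the generator receives gradients through this loss while the expert discriminator stays frozen, the term improves lip synchronization without adding any annotation beyond the raw audio-video pairing.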