DSpace at KOASAS: Controllable waveform-domain diffusion model for event-guided foley sound synthesis

Controllable waveform-domain diffusion model for event-guided foley sound synthesis
제어 가능한 이벤트 가이딩 폴리 사운드 합성을 위한 웨이브폼 도메인에서의 디퓨전 모델 활용

Chung, Yoonjin / 정윤진

This paper addresses the challenge of generating realistic, event-aligned Foley sound effects, which play a crucial role in making media more immersive. We propose a generative audio synthesis system that conditions on a sound class category and event timing to generate waveforms that match both. To preserve temporal information and improve synchronization with specific events, we introduce Block-FiLM, a block-wise variant of feature-wise linear modulation (FiLM). Experiments and ablation studies show that this approach significantly improves the quality and temporal alignment of the generated sounds, and evaluations with objective metrics and subjective listening tests confirm its effectiveness. Overall, this work advances Foley sound synthesis and indicates the potential of generative models for automating and streamlining sound production in various domains.
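As a rough illustration of the block-wise modulation idea named in the abstract, the sketch below applies a separate scale and shift to each time block of a (channels x time) feature map. It is not the thesis implementation: the function name `block_film`, the list-based data layout, and the fixed `gammas` and `betas` values are assumptions made for illustration. In a real model the scale and shift values would presumably be predicted by a network from the event timing condition.

```python
from typing import List

Matrix = List[List[float]]


def block_film(features: Matrix, gammas: Matrix, betas: Matrix) -> Matrix:
    """Apply a block-wise affine modulation to a (channels x time) feature map.

    features: one list of T values per channel.
    gammas, betas: one list of B per-block scale / shift values per channel.
    Assumption for simplicity: T is divisible by B, so all blocks have equal length.
    """
    num_steps = len(features[0])
    num_blocks = len(gammas[0])
    if num_steps % num_blocks != 0:
        raise ValueError("time length must be divisible by the number of blocks")
    block_len = num_steps // num_blocks

    modulated = []
    for channel, gamma_row, beta_row in zip(features, gammas, betas):
        # Each time step t uses the scale and shift of the block it falls into.
        modulated.append([
            gamma_row[t // block_len] * value + beta_row[t // block_len]
            for t, value in enumerate(channel)
        ])
    return modulated


# Hypothetical toy input: 2 channels, 4 time steps, 2 blocks of 2 steps each.
features = [[1.0, 1.0, 1.0, 1.0], [2.0, 2.0, 2.0, 2.0]]
gammas = [[0.0, 2.0], [1.0, 0.5]]   # per-channel, per-block scales
betas = [[0.0, 1.0], [0.0, 0.0]]    # per-channel, per-block shifts
modulated = block_film(features, gammas, betas)
```

Standard FiLM would apply a single scale and shift per channel across the whole time axis. Giving each time block its own pair lets the modulation vary over time, which is how a timing condition could influence when energy appears in the output.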

Advisors: Nam, Juhan (남주한)

Description: Korea Advanced Institute of Science and Technology (KAIST): Kim Jaechul Graduate School of AI

Publisher: Korea Advanced Institute of Science and Technology (KAIST)

Issue Date: 2023

Identifier: 325007

Language: eng

Description: Master's thesis - Korea Advanced Institute of Science and Technology (KAIST): Kim Jaechul Graduate School of AI, 2023.8, [iv, 29 p.]

Keywords: Foley sound synthesis; Timing guidance; Waveform domain diffusion

URI: http://hdl.handle.net/10203/320543

Link: http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=1045731&flag=dissertation

Appears in Collection: AI-Theses_Master (Master's Theses)

Files in This Item: There are no files associated with this item.

Knowledge Service Development Team, KAIST 291 Daehak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea. T. 82-42-350-4493 Email. koasas@kaist.ac.kr
Copyright © 2016. Korea Advanced Institute of Science and Technology. All Rights Reserved.
