Emotional singing voice synthesis by changing duration, vibrato and timbre

DC Field: Value
dc.contributor.advisor: Yoo, Chang-Dong
dc.contributor.advisor: 유창동
dc.contributor.author: Park, Youn-Sung
dc.contributor.author: 박윤성
dc.date.accessioned: 2011-12-28T02:18:45Z
dc.date.available: 2011-12-28T02:18:45Z
dc.date.issued: 2010
dc.identifier.uri: http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=455132&flag=dissertation
dc.identifier.uri: http://hdl.handle.net/10203/54248
dc.description: Thesis (Master's) - Korea Advanced Institute of Science and Technology (KAIST) : Interdisciplinary Program of Robotics, 2010.08, [vi, 33 p.]
dc.description.abstract: In this thesis, a novel emotional singing voice synthesis system is considered. There have been various approaches to expressing emotion between humans and machines or robots, such as varying a robot's facial expression, actions and synthesized speech. Although singing is known to be an effective way of expressing emotion, there has been no research on using singing to express emotion. To synthesize a singing voice with emotion, a statistical parametric synthesis system is used. The system uses a singing database, composed of various melodies sung neutrally with a restricted set of words, together with hidden semi-Markov models (HSMMs) of notes ranging from G3 to E5, to construct statistical information. The procedure of the statistical parametric synthesis system consists of two main parts: training and synthesis. In the training part, spectrum and excitation parameters are extracted from the singing database, and the statistical information of the spectrum and excitation parameters is constructed for each note. Three steps are taken in the synthesis part: (1) pitch and duration are determined according to the notes indicated by the musical score; (2) features are sampled from the appropriate HSMMs, with the duration set to its maximum-probability value; (3) the singing voice is synthesized by the mel-log spectrum approximation (MLSA) filter, using the sampled features as the filter parameters. The emotion of a synthesized song is controlled by varying the duration, the vibrato parameters and the timbre according to Thayer's mood model, which defines emotions along tension and energy axes. A perception test is performed to evaluate the synthesized songs. The results show that the algorithm can control the expressed emotion of a singing voice given a neutral singing database.
dc.language: eng
dc.publisher: 한국과학기술원 (KAIST)
dc.subject: Vibrato model
dc.subject: Emotion expression
dc.subject: Statistical singing voice synthesis
dc.subject: Timbre conversion filter
dc.subject: 음색 변조 필터 (timbre conversion filter)
dc.subject: 비브라토 모델 (vibrato model)
dc.subject: 감정 표현 (emotion expression)
dc.subject: 통계학적 노래합성 (statistical singing voice synthesis)
dc.title: Emotional singing voice synthesis by changing duration, vibrato and timbre
dc.title.alternative: 음 길이, 비브라토 그리고 음색의 변화를 이용한 감정 노래 합성
dc.type: Thesis (Master)
dc.identifier.CNRN: 455132/325007
dc.description.department: Korea Advanced Institute of Science and Technology (KAIST) : Interdisciplinary Program of Robotics
dc.identifier.uid: 020084053
dc.contributor.localauthor: Yoo, Chang-Dong
dc.contributor.localauthor: 유창동
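
The abstract above outlines a three-step HSMM-based synthesis procedure and an emotion control that varies duration, vibrato and timbre along Thayer's tension/energy axes. Below is a minimal, self-contained Python sketch of that flow, not the thesis code: NoteModel, emotion_params, the frame shift, the sample rate and every numeric mapping are hypothetical placeholders, the HSMM sampling is faked with Gaussians, and a plain sinusoid stands in for the MLSA filter.

import numpy as np

SR = 16000          # assumed sample rate (not stated in the abstract)
FRAME_S = 0.005     # assumed 5 ms frame shift

class NoteModel:
    """Hypothetical stand-in for one trained per-note HSMM (G3..E5).
    The real models hold state-duration and output distributions over
    spectrum and excitation features; here both are faked with Gaussians."""
    def __init__(self, f0_hz):
        self.f0_hz = f0_hz

    def sample_features(self, n_frames, rng):
        # Synthesis step 2: sample features with the duration already fixed
        f0 = np.full(n_frames, self.f0_hz)
        mcep = rng.normal(0.0, 0.1, size=(n_frames, 25))  # fake mel-cepstrum
        return f0, mcep

def note_to_hz(midi):
    return 440.0 * 2.0 ** ((midi - 69) / 12.0)

def emotion_params(tension, energy):
    """Map a point on Thayer's tension/energy plane to control values.
    The numeric mapping is invented for illustration; the thesis derives
    its own settings for duration, vibrato and timbre."""
    return {
        "dur_scale": 1.0 / (1.0 + 0.3 * energy),   # more energy -> shorter notes
        "vib_depth_cents": 30.0 + 40.0 * energy,   # deeper vibrato
        "vib_rate_hz": 5.5 + 1.0 * tension,        # slightly faster vibrato
        "brightness": 0.5 * tension,               # timbre tilt
    }

def apply_vibrato(f0, depth_cents, rate_hz):
    # Sinusoidal pitch modulation in cents around the note's base f0
    t = np.arange(len(f0)) * FRAME_S
    return f0 * 2.0 ** (depth_cents * np.sin(2 * np.pi * rate_hz * t) / 1200.0)

def synthesize_note(midi, beats, tempo_bpm, emo, rng):
    # Step 1: pitch and duration from the score, duration scaled by emotion
    dur_s = beats * 60.0 / tempo_bpm * emo["dur_scale"]
    n_frames = max(1, int(round(dur_s / FRAME_S)))

    # Step 2: sample spectrum/excitation features from the note's "HSMM"
    f0, mcep = NoteModel(note_to_hz(midi)).sample_features(n_frames, rng)
    f0 = apply_vibrato(f0, emo["vib_depth_cents"], emo["vib_rate_hz"])
    # Timbre control: tilt the (fake) spectral envelope toward brightness;
    # in the thesis a timbre conversion filter plays this role
    mcep += emo["brightness"] * np.linspace(-0.1, 0.1, mcep.shape[1])

    # Step 3: the thesis drives an MLSA filter with f0 + mcep; as a crude
    # stand-in, render a sinusoid at the vibrato-modulated f0 (mcep unused)
    f0_per_sample = np.repeat(f0, int(FRAME_S * SR))
    phase = 2.0 * np.pi * np.cumsum(f0_per_sample) / SR
    return 0.3 * np.sin(phase)

rng = np.random.default_rng(0)
emo = emotion_params(tension=0.2, energy=0.9)    # e.g. a high-energy corner
score = [(67, 1.0), (69, 1.0), (71, 2.0)]        # (MIDI note, beats): G4-A4-B4
audio = np.concatenate([synthesize_note(m, b, 120.0, emo, rng) for m, b in score])

The sketch only shows where each control enters the pipeline (duration scaling at step 1, vibrato depth/rate on the sampled f0, a timbre tilt on the spectral envelope); the real system would pass the sampled spectrum and excitation features to the MLSA filter rather than a sinusoid renderer.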
Appears in Collection
RE-Theses_Master (석사논문, Master's theses)
Files in This Item
There are no files associated with this item.
