Singing voice generation and phrase emphasis using glottal-waveform (성대파를 이용한 가창음성 생성 및 어구 강조)

DC Field | Value | Language
dc.contributor.advisor | Oh, Yung-Hwan | -
dc.contributor.advisor | 오영환 | -
dc.contributor.author | Bae, Jae-Hyun | -
dc.contributor.author | 배재현 | -
dc.date.accessioned | 2011-12-13T05:27:58Z | -
dc.date.available | 2011-12-13T05:27:58Z | -
dc.date.issued | 2011 | -
dc.identifier.uri | http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=466474&flag=dissertation | -
dc.identifier.uri | http://hdl.handle.net/10203/33334 | -
dc.description | Thesis (Ph.D.) - Korea Advanced Institute of Science and Technology (KAIST): Department of Computer Science, 2011.2, [viii, 59 p.] | -
dc.description.abstract | Research in speech synthesis has mainly addressed the generation of plain read speech, and the quality of synthesized speech has improved greatly. Recently, conversational, emotional, and expressive speech synthesis have been widely studied, as has voice quality, that is, the color of the voice. Studies on expressive TTS focus mainly on the corpus-based method, which records utterances under various circumstances and selects the units appropriate to each context. In corpus-based synthesis, natural speech segments are used with almost no modification. The advantage of this approach is that the synthetic speech is very natural, but there are also disadvantages. One is that a huge number of spoken sentences must be recorded to cope with various circumstances; in a context that was not prepared for, the naturalness of the synthetic speech may therefore be degraded. Another is that the prosody of the synthetic speech may differ from the target prosody produced by the prosody module. Within research on voice color, the glottal waveform is widely studied: modeling and modifying the glottal waveform produces high-quality synthetic speech. In this paper, we aim to generate, from a plain read sentence, speech in which the key phrase is emphasized, by transforming the glottal waveform. The plain synthetic speech of a conventional TTS system cannot express the speaker's intention. In real environments, by contrast, people emphasize the keyword or phrase that they want to deliver clearly. An emphasized keyword or phrase has a stronger voice color than the other phrases. Using this phenomenon, we emphasize the keyword or phrase that is the contextual core of the sentence, and we use glottal waveforms to make the re-synthesized speech more natural.
To estimate the glo... | eng
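dc.language | eng | -
dc.publisher | Korea Advanced Institute of Science and Technology (한국과학기술원) | -
dc.subject | Glottal waveform Transformation | -
dc.subject | Phrase Emphasis | -
dc.subject | Speech Synthesis | -
dc.subject | Singing Voice Generation | -
dc.subject | 가창음성 생성 (singing voice generation) | -
dc.subject | 성대파 변환 (glottal waveform transformation) | -
dc.subject | 어구 강조 (phrase emphasis) | -
dc.subject | 음성 합성 (speech synthesis) | -
dc.title | Singing voice generation and phrase emphasis using glottal-waveform | -
dc.title.alternative | 성대파를 이용한 가창음성 생성 및 어구 강조 | -
dc.type | Thesis(Ph.D) | -
dc.identifier.CNRN | 466474/325007 | -
dc.description.department | Korea Advanced Institute of Science and Technology (KAIST): Department of Computer Science | -
dc.identifier.uid | 020005820 | -
dc.contributor.localauthor | Oh, Yung-Hwan | -
dc.contributor.localauthor | 오영환 | -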
Appears in Collection
CS-Theses_Ph.D.(박사논문)
Files in This Item
There are no files associated with this item.
