DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | Oh, Yung-Hwan | - |
dc.contributor.advisor | 오영환 | - |
dc.contributor.author | Bae, Jae-Hyun | - |
dc.contributor.author | 배재현 | - |
dc.date.accessioned | 2011-12-13T05:27:58Z | - |
dc.date.available | 2011-12-13T05:27:58Z | - |
dc.date.issued | 2011 | - |
dc.identifier.uri | http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=466474&flag=dissertation | - |
dc.identifier.uri | http://hdl.handle.net/10203/33334 | - |
dc.description | Thesis (Ph.D.) - Korea Advanced Institute of Science and Technology (KAIST) : Department of Computer Science, 2011.2, [ viii, 59 p. ] | - |
dc.description.abstract | Research in speech synthesis has focused mainly on generating plain read speech, and the quality of such synthesized speech has improved greatly. Recently, dialogue-style, emotional, and expressive speech synthesis has been widely studied, along with voice quality, that is, the color of the voice. Among these topics, studies on expressive TTS have focused mainly on corpus-based methods. Such a method records utterances under various circumstances and selects the units appropriate to a given context. In corpus-based synthesis, natural speech segments are used with almost no modification. The advantage of this approach is that the synthetic speech is very natural, but there are also disadvantages. One is that a huge number of sentences must be recorded to cope with various circumstances; as a result, the naturalness of the synthetic speech may be degraded in contexts that were not prepared for. Another is that the prosody of the synthetic speech may differ from the target prosody that the prosody module produces. In research on voice color, the glottal waveform is widely studied: modeling and modifying the glottal waveform yields high-quality synthetic speech. In this paper, we aim to generate speech in which a key phrase is emphasized, starting from a plain read speech sentence, by transforming the glottal waveform. Plain synthetic speech from a conventional TTS system cannot express the speaker's intention. In real environments, by contrast, people may emphasize the keyword or phrase that they want to deliver clearly. The emphasized keyword or phrase has a stronger voice color than the other phrases. Utilizing this phenomenon, we aim to emphasize the keyword or phrase that is the contextual core of the sentence. We use glottal waveforms to make the re-synthesized speech more natural. 
To estimate the glo... | eng |
dc.language | eng | - |
dc.publisher | Korea Advanced Institute of Science and Technology (KAIST) | - |
dc.subject | Glottal waveform Transformation | - |
dc.subject | Phrase Emphasis | - |
dc.subject | Speech Synthesis | - |
dc.subject | Singing Voice Generation | - |
dc.subject | 가창음성 생성 | - |
dc.subject | 성대파 변환 | - |
dc.subject | 어구 강조 | - |
dc.subject | 음성 합성 | - |
dc.title | Singing voice generation and phrase emphasis using glottal-waveform | - |
dc.title.alternative | 성대파를 이용한 가창음성 생성 및 어구 강조 | - |
dc.type | Thesis(Ph.D) | - |
dc.identifier.CNRN | 466474/325007 | - |
dc.description.department | Korea Advanced Institute of Science and Technology (KAIST) : Department of Computer Science, | - |
dc.identifier.uid | 020005820 | - |
dc.contributor.localauthor | Oh, Yung-Hwan | - |
dc.contributor.localauthor | 오영환 | - |