DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | Oh, Yung-Hwan | - |
dc.contributor.advisor | 오영환 | - |
dc.contributor.author | Lee, Seung-Uk | - |
dc.contributor.author | 이승욱 | - |
dc.date.accessioned | 2013-09-12T01:51:36Z | - |
dc.date.available | 2013-09-12T01:51:36Z | - |
dc.date.issued | 2011 | - |
dc.identifier.uri | http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=467932&flag=dissertation | - |
dc.identifier.uri | http://hdl.handle.net/10203/180572 | - |
dc.description | Thesis (Master's) - Korea Advanced Institute of Science and Technology (KAIST) : Department of Computer Science, 2011.2, [ vi, 39 p. ] | - |
dc.description.abstract | Hidden Markov models (HMMs) are widely used in recent research on statistical parametric speech synthesis. An HMM is a generative model frequently used in speech recognition; in speech synthesis it is applied to parameter generation, the stage that precedes signal processing. HMM-based speech synthesis has several advantages. Much less storage is needed, because the speech corpus does not have to be kept once training is finished. Furthermore, speech with various voice characteristics, speaking styles, and emotions is easy to obtain by modifying the parameters. Other advantages include multilingual support, robustness, and the ability to control each parameter separately. The commonly cited drawbacks of this kind of speech synthesis, such as a vocoder-like sound or unnaturalness caused by reconstructing speech from parameters, are gradually being overcome. However, most HMM-based speech synthesis approaches remain weak in prosody. Prosody is an important factor in verbal communication; some research argues that prosody has a greater impact on communication than the meaning of the words themselves. The primary weakness of HMM-based speech synthesis systems in generating prosody is that they model prosodic features in subword units, i.e. phones: each model in the trained HMM set corresponds to a phone. They therefore have difficulty utilizing suprasegmental information such as the relations between words, the structure of the sentence, and the lengths of each word, phrase, and sentence. As a result, an HMM-based system lacks the ability to create natural speech with human-like changes in pitch and tempo; instead it creates machine-like speech that has the same pauses at every space and pronounces all words in the same way, without any change in strength or rate. Context-dependent HMMs have been suggested to overcome this problem, but they have not been a fundamental solution for prosody.
We have researched generating ... | eng |
dc.language | eng | - |
dc.publisher | Korea Advanced Institute of Science and Technology (KAIST) | - |
dc.subject | Speech synthesis | - |
dc.subject | HMM | - |
dc.subject | CART | - |
dc.subject | prosody | - |
dc.subject | 음성 합성 | - |
dc.subject | 은닉 마르코프 모델 | - |
dc.subject | 분류 및 회귀 트리 | - |
dc.subject | 운율 | - |
dc.subject | 한국어 | - |
dc.subject | Korean | - |
dc.title | HMM-based Korean speech synthesis using suprasegmental prosodic features | - |
dc.title.alternative | 초분절적 운율 정보를 이용한 HMM 기반 한국어 음성 합성 | - |
dc.type | Thesis(Master) | - |
dc.identifier.CNRN | 467932/325007 | - |
dc.description.department | Korea Advanced Institute of Science and Technology (KAIST) : Department of Computer Science | - |
dc.identifier.uid | 020083380 | - |
dc.contributor.localauthor | Oh, Yung-Hwan | - |
dc.contributor.localauthor | 오영환 | - |