Speech synthesis and speaker modification based on two-band speech model (2대역 음성모델에 기반한 음성합성 및 화자변환)

Speech analysis/synthesis is a technique for analyzing a speech signal, converting it into suitable parameters, modifying those parameters, and resynthesizing the speech signal from them; it is essential for high-quality speech synthesis, speech coding, and speaker modification. Many speech models based on the speech production mechanism have been proposed for speech analysis/synthesis, and they represent the speech signal by several meaningful sets of model parameters. The two-band speech model, a simplified form of the harmonic/stochastic (H/S) model, assumes that voiced and unvoiced characteristics can be mixed in one speech frame and that their regions are divided into two bands by a time-varying frequency. The voiced region (periodic part), which has strong periodic characteristics, is generally modeled by a sum of sinusoids, whereas the unvoiced region (random part), which has no periodic characteristics, is modeled by linearly filtered white Gaussian noise. The frequency dividing the periodic part from the random part is called the band-splitting frequency. Since accurate separation of the two parts is central to the two-band speech model, determining a reasonable band-splitting frequency is very important for high-quality synthesized speech. In this thesis, a new score function for splitting the periodic and random parts of the two-band speech model is proposed, and an algorithm that determines the band-splitting frequency by choosing the value that maximizes this function is described. First, the combined subband periodicity score (CSPS) function, defined for an arbitrary frequency as the sum of a periodicity score of the lower-band spectrum and a non-periodicity score of the upper-band spectrum, is computed from an autocorrelation function. Furthermore, a recurrence relation is derived to reduce the computational complexity of the CSPS function, and a tracking technique that guarantees continuity between neighboring frames is proposed. Experimental results have shown that the pr...
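The two-band structure described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the function name `synthesize_two_band_frame`, the unit harmonic amplitudes, the default sampling rate and frame length, and the ideal FFT-domain high-pass used as the "linear filter" are all assumptions made for illustration. The sketch only shows how one frame splits into a sum of sinusoids below a given band-splitting frequency and filtered white Gaussian noise above it; it does not estimate the band-splitting frequency or compute the CSPS function.

```python
import numpy as np


def synthesize_two_band_frame(f0_hz, split_hz, fs_hz=16000, n_samples=512,
                              noise_gain=0.1, seed=0):
    """Return (periodic, random_part) for one frame of a toy two-band model.

    All parameter names and defaults are hypothetical, chosen for illustration.
    """
    t = np.arange(n_samples) / fs_hz

    # Periodic part: unit-amplitude harmonics of f0 up to the
    # band-splitting frequency (assumed amplitudes, zero phase).
    n_harmonics = int(split_hz // f0_hz)
    harmonics = np.arange(1, n_harmonics + 1)
    periodic = np.cos(2.0 * np.pi * f0_hz * np.outer(harmonics, t)).sum(axis=0)

    # Random part: white Gaussian noise passed through a linear filter.
    # Assumption: an ideal high-pass, implemented by zeroing the FFT bins
    # that lie below the band-splitting frequency.
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal(n_samples)
    spectrum = np.fft.rfft(noise)
    freqs = np.fft.rfftfreq(n_samples, d=1.0 / fs_hz)
    spectrum[freqs < split_hz] = 0.0
    random_part = noise_gain * np.fft.irfft(spectrum, n=n_samples)

    return periodic, random_part


# Example: 125 Hz pitch, band-splitting frequency at 2 kHz.
periodic, random_part = synthesize_two_band_frame(f0_hz=125.0, split_hz=2000.0)
frame = periodic + random_part
```

In this toy setup the pitch is chosen so that every harmonic falls exactly on an FFT bin, which keeps the two bands cleanly separated; real speech frames would need windowing and estimated harmonic amplitudes and phases.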
Advisors
Oh, Yung-Hwan (오영환)
Description
Korea Advanced Institute of Science and Technology (KAIST) : Computer Science
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2003
Identifier
181183/325007 / 000975062
Language
eng
Description

Doctoral thesis (Ph.D.) - Korea Advanced Institute of Science and Technology (KAIST) : Computer Science, 2003.2, [ ix, 82 p. ]

Keywords

speech analysis; speech coding; voice conversion; speech synthesis; speech model

URI
http://hdl.handle.net/10203/32836
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=181183&flag=dissertation
Appears in Collection
CS-Theses_Ph.D.(박사논문)
Files in This Item
There are no files associated with this item.
