Modulation spectrum-constrained trajectory error training for mixture density network-based speech synthesis

In statistical parametric speech synthesis, a mixture density network is employed to address the limitations of a linear output layer, such as pre-computed fixed variances and the unimodal assumption. However, it has a drawback of its own: it cannot apply the static-dynamic constraint that is needed in the training phase for high-quality speech synthesis. To cope with this problem, this paper proposes a training algorithm for a mixture density network based on the minimum trajectory error. A modulation spectrum-constrained loss function is also proposed to alleviate the over-smoothing effect. The experimental results confirm meaningful improvement in both objective and subjective performance measures.
Publisher
ACOUSTICAL SOC AMER AMER INST PHYSICS
Issue Date
2018-09
Language
English
Article Type
Article
Citation

JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, v.144, no.3, pp.EL151 - EL157

ISSN
0001-4966
DOI
10.1121/1.5052206
URI
http://hdl.handle.net/10203/248313
Appears in Collection
EE-Journal Papers (Journal Papers)
