Nonlinear feature extraction techniques for understanding sensory data with application to speech = 감각 정보 이해를 위한 비선형 특징 추출 방법들과 이의 음성 신호에 대한 응용

Speech is one of the most preferred ways of human communication. Although it can be represented as a one-dimensional time series, the processes of its generation and perception are highly nonlinear, and attempts to model and understand its structure have met with limited success. In this thesis, a general framework for understanding sensory data is provided using theories from nonlinear dynamics. More specifically, a number of useful theorems on the recoverability of the dynamical system behind the observed sensory data are presented. Guided by the theoretical results, several recently proposed nonlinear machine learning techniques are improved and applied to speech data to uncover its structure and solve practical problems such as fundamental frequency estimation and speech recognition. The contributions of this thesis can be summarized as follows. First, a theorem on the recoverability of controllable dynamical systems using a delay embedding map is proved by extending Takens' theorem. It is shown that by imposing some constraints on the maximum and minimum periods of the deterministic dynamical system $M$, one can approximately reconstruct the product manifold of parameters $N$ and the attractors of the underlying dynamical systems $M$ given only a one-dimensional observation. This opens up the possibility of estimating hidden parameters of a sensory signal without requiring much domain-specific knowledge of it. The theorem also provides some guidance on choosing the dimensionality of the delay embedding. In addition, a theorem bounding the reconstruction error is given. The reconstruction error of the delay embedding map is bounded if we assume a $(D,\delta)$-slow parameter trajectory with $\delta$ small enough, and it goes to zero as $\delta \rightarrow 0$. In other words, if the parameter governing the passive dynamical system changes slowly enough, then we can reconstruct the product manifold. Second, the method of recovering the controllable dynamical...
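The delay embedding map referred to in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the function name `delay_embed`, the parameters `dim` and `tau`, and the toy signal (a sinusoid whose frequency drifts slowly, standing in for a slowly varying hidden parameter) are all assumptions chosen for illustration.

```python
import numpy as np

def delay_embed(x, dim, tau=1):
    """Map a 1-D series x to delay vectors
    [x[t], x[t + tau], ..., x[t + (dim - 1) * tau]] (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    n = x.size - (dim - 1) * tau  # number of complete delay vectors
    if n <= 0:
        raise ValueError("series too short for the requested dim and tau")
    # Column i holds the series shifted by i * tau samples.
    return np.stack([x[i * tau : i * tau + n] for i in range(dim)], axis=1)

# Toy one-dimensional observation (an assumption, not thesis data):
# a sinusoid whose frequency, the "hidden parameter", changes slowly.
t = np.arange(2000)
freq = 0.01 + 0.005 * t / t.size
signal = np.sin(2 * np.pi * np.cumsum(freq))

# Each row is one point of the reconstructed trajectory in R^3.
embedded = delay_embed(signal, dim=3, tau=5)
print(embedded.shape)
```

The embedding dimension and delay here are arbitrary; the abstract states that the theorem gives guidance on picking the dimensionality, but the truncated text does not give the actual rule.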
Advisors
Lee, Soo-Young (이수영)
Description
Korea Advanced Institute of Science and Technology (KAIST): Department of Bio and Brain Engineering
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2010
Identifier
455335/325007  / 020087061
Language
eng
Description

Doctoral thesis (Ph.D.) - Korea Advanced Institute of Science and Technology (KAIST): Department of Bio and Brain Engineering, 2010.08, [ xiii, 104 p. ]

Keywords

Fundamental Frequency; Deep Learning; Manifold Learning; Dynamical System; Phoneme Recognition

URI
http://hdl.handle.net/10203/27084
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=455335&flag=dissertation
Appears in Collection
BiS-Theses_Ph.D.(박사논문)
