Eigenvoice reconstruction for rapid speaker adaptation (Korean title: Eigenvoice 재구성 기법을 이용한 고속 화자 적응)

Speech recognition is considered one of the most natural modes of man-machine interaction. Many studies have sought to lower error rates for speakers with varied characteristics, and speech recognition systems have steadily improved. Even so, a speaker-dependent (SD) system generally outperforms a speaker-independent (SI) system when tested on the speaker it was trained for. SI systems are nevertheless more common in real applications, because an SD system requires a large amount of training data from its speaker. Speaker adaptation produces a system suited to a specific speaker from an SI system, using a small amount of adaptation data from that speaker. Current research focuses on rapid speaker adaptation, which works with about 30 seconds of data or less, because a growing range of applications cannot request a long speech sample for adaptation.

Speaker adaptation in eigenvoice space is a popular method for rapid speaker adaptation. It constrains the adapted model to a linear combination of a small number of basis vectors, called eigenvoices, obtained from a set of reference speakers, thereby reducing the number of free parameters to be estimated. Eigenvoice adaptation performs well with a very small amount of adaptation data, but it has drawbacks. One is that the recognition rate of the adapted model reaches a plateau quite quickly as the amount of adaptation data grows. The plateau arises because the number of free parameters is too small to produce a sophisticated model; conversely, a model with too many free parameters relative to the amount of adaptation data may overfit. Solving this problem requires a method that controls the number of free parameters according to the amount of adaptation data. In this thesis, we propose speaker adaptation using structural eigenvoices. In this method, we can decide the number ...
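The eigenvoice constraint described in the abstract can be illustrated with a small numerical sketch. This is not the thesis's method or its structural eigenvoices: it assumes each speaker is summarized by a fixed-length "supervector" (for example, stacked model means), derives the eigenvoices by PCA over the reference speakers, and estimates the weights by ordinary least squares on the observed components, as a stand-in for the likelihood-based estimation a real recognizer would use. The function names `train_eigenvoices` and `adapt`, the toy dimensions, and the random data are all hypothetical.

```python
import numpy as np

def train_eigenvoices(supervectors, k):
    """PCA over reference speakers (one supervector per row).

    Returns the mean supervector and the k leading principal
    directions, which play the role of eigenvoices here.
    """
    mean = supervectors.mean(axis=0)
    # Rows of vt are orthonormal principal directions, strongest first.
    _, _, vt = np.linalg.svd(supervectors - mean, full_matrices=False)
    return mean, vt[:k]

def adapt(mean, eigenvoices, observed, mask):
    """Estimate k weights from the observed components only.

    `mask` marks the supervector entries covered by the adaptation
    data; least squares is an assumption made for this sketch.
    """
    design = eigenvoices[:, mask].T          # shape (n_observed, k)
    target = observed[mask] - mean[mask]
    weights, *_ = np.linalg.lstsq(design, target, rcond=None)
    # The adapted model is the mean plus a linear combination of eigenvoices.
    return mean + weights @ eigenvoices, weights

# Toy data (hypothetical): speakers lie in a 3-dimensional affine subspace.
rng = np.random.default_rng(0)
dim, n_ref, k = 40, 20, 3
basis = rng.normal(size=(k, dim))
offset = rng.normal(size=dim)
reference = offset + rng.normal(size=(n_ref, k)) @ basis
new_speaker = offset + rng.normal(size=k) @ basis

# Adaptation data covers only the first 10 of 40 components.
mask = np.zeros(dim, dtype=bool)
mask[:10] = True

mean, eigenvoices = train_eigenvoices(reference, k)
adapted, weights = adapt(mean, eigenvoices, new_speaker, mask)
print("free parameters estimated:", weights.size)
print("max error on unseen components:",
      np.abs(adapted - new_speaker)[~mask].max())
```

Only `k` weights are estimated regardless of the supervector dimension, which is why little data suffices. It also shows where the plateau comes from: once those few weights are well estimated, additional adaptation data cannot move the model outside the span of the chosen eigenvoices.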
Advisors
Oh, Yung-Hwan (오영환)
Description
Korea Advanced Institute of Science and Technology (KAIST): Department of Computer Science
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2010
Identifier
455451/325007  / 000995370
Language
eng
Description

Doctoral thesis (Ph.D.) - Korea Advanced Institute of Science and Technology (KAIST): Department of Computer Science, 2010.08, [viii, 66 p.]

Keywords

speech recognition; speaker adaptation; eigenvoice; model merging

URI
http://hdl.handle.net/10203/33323
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=455451&flag=dissertation
Appears in Collection
CS-Theses_Ph.D.(박사논문)
Files in This Item
There are no files associated with this item.
