DSpace at KOASAS: Eigenvoice reconstruction for rapid speaker adaptation


Eigenvoice reconstruction for rapid speaker adaptation
Korean title (translated): Rapid speaker adaptation using an eigenvoice reconstruction technique


Choi, Dong-Jin / 최동진

Speech recognition is considered one of the most natural modes of man-machine interaction. Many studies have sought low error rates for speakers with varied characteristics, and speech recognition systems have steadily improved. Still, a speaker-dependent (SD) system generally outperforms a speaker-independent (SI) system when tested on the same speaker. SI systems are nevertheless more common in real applications, because an SD system requires a large amount of training data.

Speaker adaptation produces a system suited to a specific speaker from an SI system, using a small amount of adaptation data from that speaker. Researchers are now concerned with rapid speaker adaptation, that is, speaker adaptation with around 30 seconds of data or less, because a growing range of applications cannot request a long speech sample for adaptation.

Speaker adaptation in eigenvoice space is a popular method for rapid speaker adaptation. It constrains the adapted model to a linear combination of a small number of basis vectors, called eigenvoices, obtained from a set of reference speakers, thereby reducing the number of free parameters to be estimated. Eigenvoice adaptation performs well given a very small amount of adaptation data, but it has some problems. One drawback is that the recognition rate of the adapted model reaches a plateau quite quickly, because the number of free parameters is too small to generate a sophisticated model. On the other hand, overfitting may occur when a model has too many free parameters relative to the amount of adaptation data. To solve this problem, a method is needed that controls the number of free parameters according to the amount of adaptation data. In this thesis, we propose speaker adaptation using structural eigenvoices. In this method, we can decide the number ...
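The constraint described in the abstract (an adapted model written as a linear combination of a few eigenvoices) can be illustrated with a small sketch. It is not the thesis's method: it assumes each speaker is already summarized as a fixed-length "supervector" of model parameters, obtains the eigenvoices by plain principal component analysis (power iteration), and swaps in a least-squares projection for the likelihood-based weight estimation a real recognizer would need. The names top_eigenvoices and adapt, and the toy data, are hypothetical.

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def top_eigenvoices(supervectors, k, iters=200, seed=0):
    """Return (mean, basis): the mean supervector and k orthonormal
    principal directions ("eigenvoices") of the reference speakers,
    found by power iteration with deflation. Illustrative only."""
    n = len(supervectors)
    mean = [sum(col) / n for col in zip(*supervectors)]
    centered = [[x - m for x, m in zip(v, mean)] for v in supervectors]
    dim = len(mean)
    rng = random.Random(seed)
    basis = []
    for _ in range(k):
        e = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        for _ in range(iters):
            # Remove components along eigenvoices already found.
            for b in basis:
                p = dot(e, b)
                e = [x - p * y for x, y in zip(e, b)]
            # Multiply by the scatter matrix: sum_i (c_i . e) c_i
            proj = [dot(c, e) for c in centered]
            e = [sum(p * c[j] for p, c in zip(proj, centered))
                 for j in range(dim)]
            norm = dot(e, e) ** 0.5
            if norm < 1e-12:
                raise ValueError("k exceeds the rank of the reference data")
            e = [x / norm for x in e]
        basis.append(e)
    return mean, basis

def adapt(observation, mean, basis):
    """Least-squares stand-in for weight estimation: project the new
    speaker's (rough) supervector onto the eigenvoice basis, so only
    len(basis) free parameters are estimated."""
    centered = [x - m for x, m in zip(observation, mean)]
    weights = [dot(centered, e) for e in basis]
    adapted = list(mean)
    for w, e in zip(weights, basis):
        adapted = [a + w * x for a, x in zip(adapted, e)]
    return weights, adapted

# Toy usage: five reference speakers with 4-dimensional supervectors.
reference = [
    [3.1, 4.1, 3.7, 3.3],
    [-1.1, -0.1, 3.7, 3.3],
    [3.1, 4.1, 2.3, 4.7],
    [-1.1, -0.1, 2.3, 4.7],
    [1.0, 2.0, 3.0, 4.0],
]
mean, basis = top_eigenvoices(reference, k=2)
weights, adapted = adapt([2.4, 3.4, 2.3, 4.7], mean, basis)
print(weights, adapted)
```

In this sketch, k is the number of free parameters: a small k can be estimated from little data but limits how closely the adapted model can follow the speaker, which mirrors the trade-off the abstract describes.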

Advisors: Oh, Yung-Hwan (오영환)

Description: Korea Advanced Institute of Science and Technology (KAIST): Department of Computer Science

Publisher: Korea Advanced Institute of Science and Technology (KAIST)

Issue Date: 2010

Identifier: 455451/325007 / 000995370

Language: eng

Description: Doctoral thesis (Ph.D.) - Korea Advanced Institute of Science and Technology (KAIST): Department of Computer Science, 2010.08, [viii, 66 p.]

Keywords: speech recognition; speaker adaptation; eigenvoice; model merging

URI: http://hdl.handle.net/10203/33323

Link: http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=455451&flag=dissertation

Appears in Collection: CS-Theses_Ph.D. (Doctoral Theses)

Files in This Item: There are no files associated with this item.


Knowledge Service Development Team, KAIST 291 Daehak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea. T. 82-42-350-4493 Email. koasas@kaist.ac.kr
Copyright © 2016. Korea Advanced Institute of Science and Technology. All Rights Reserved.
