DSpace at KOASAS: Acoustic Model Combination Incorporated With Mask-Based Multi-Channel Source Separation for Automatic Speech Recognition

Acoustic Model Combination Incorporated With Mask-Based Multi-Channel Source Separation for Automatic Speech Recognition

DC Field	Value	Language
dc.contributor.author	Yoon, JS	ko
dc.contributor.author	Park, JH	ko
dc.contributor.author	Kim, HK	ko
dc.contributor.author	Kim, HoiRin	ko
dc.date.accessioned	2011-03-14T08:03:56Z	-
dc.date.available	2011-03-14T08:03:56Z	-
dc.date.created	2012-02-06	-
dc.date.issued	2010-10	-
dc.identifier.citation	IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING, v.4, no.5, pp.772 - 784	-
dc.identifier.issn	1932-4553	-
dc.identifier.uri	http://hdl.handle.net/10203/22631	-
dc.description.abstract	In this paper, we propose an acoustic model combination (AMC) technique for reducing a mismatch between training and testing conditions of an automatic speech recognition (ASR) system in a multi-channel noisy environment. In our previous work, we proposed a hidden Markov model (HMM)-based mask estimation method for multi-channel source separation using two microphones, where HMMs were adopted for mask estimation in order to incorporate an observation that the mask information should be correlated over contiguous analysis frames. However, it was observed that a certain degree of noise still remained in the separated speech source especially under low signal-to-noise ratio (SNR) conditions. This was because the estimated mask was not ideal, which resulted in limiting the improvement of ASR performance. To mitigate this problem, the remaining noise can be further compensated in the acoustic model domain under a framework of parallel model combination (PMC). In particular, a noise model and a weighting factor for the proposed AMC can be estimated from the remaining noise and the average of the relative magnitude of the mask, respectively. It is shown from the experiments that an ASR system employing the proposed AMC technique achieves a relative average word error rate (WER) reduction of 56.91%, when compared to a system using the mask-based source separation alone. In addition, compared to a conventional PMC implemented with a log-normal approximation, the proposed AMC relatively reduces WER by 43.64%.	-
dc.language	English	-
dc.language.iso	en_US	en
dc.publisher	IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC	-
dc.subject	NOISE	-
dc.title	Acoustic Model Combination Incorporated With Mask-Based Multi-Channel Source Separation for Automatic Speech Recognition	-
dc.type	Article	-
dc.identifier.wosid	000283266800002	-
dc.identifier.scopusid	2-s2.0-77956739077	-
dc.type.rims	ART	-
dc.citation.volume	4	-
dc.citation.issue	5	-
dc.citation.beginningpage	772	-
dc.citation.endingpage	784	-
dc.citation.publicationname	IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING	-
dc.identifier.doi	10.1109/JSTSP.2010.2057196	-
dc.embargo.liftdate	9999-12-31	-
dc.embargo.terms	9999-12-31	-
dc.contributor.localauthor	Kim, HoiRin	-
dc.contributor.nonIdAuthor	Yoon, JS	-
dc.contributor.nonIdAuthor	Park, JH	-
dc.contributor.nonIdAuthor	Kim, HK	-
dc.type.journalArticle	Article	-
dc.subject.keywordAuthor	Computational auditory scene analysis (CASA)	-
dc.subject.keywordAuthor	mask estimation	-
dc.subject.keywordAuthor	mask-based noise model estimation	-
dc.subject.keywordAuthor	mask-based weighting factor estimation	-
dc.subject.keywordAuthor	multi-channel source separation (MCSS)	-
dc.subject.keywordAuthor	parallel model combination	-
dc.subject.keywordAuthor	speech recognition	-
dc.subject.keywordPlus	NOISE	-
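
Note on the abstract: the baseline it names, parallel model combination (PMC) with a log-normal approximation, can be illustrated with a minimal sketch. This is not the authors' AMC method, whose noise model and weighting factor are estimated from the remaining noise and the mask. The sketch assumes diagonal-covariance Gaussians given directly in the log-spectral domain (real systems usually start from cepstral parameters), and the function name `lognormal_pmc` and the scalar `gain` parameter are hypothetical.

```python
import numpy as np


def lognormal_pmc(mu_speech, var_speech, mu_noise, var_noise, gain=1.0):
    """Combine a clean-speech Gaussian and a noise Gaussian (log-spectral domain).

    All statistics are 1-D arrays of per-bin means/variances (diagonal covariance).
    `gain` is a hypothetical scalar weight on the speech term; it is an assumption
    of this sketch, not the weighting factor defined in the paper.
    """
    mu_speech = np.asarray(mu_speech, dtype=float)
    var_speech = np.asarray(var_speech, dtype=float)
    mu_noise = np.asarray(mu_noise, dtype=float)
    var_noise = np.asarray(var_noise, dtype=float)

    # Step 1: moments of the corresponding log-normal variables (linear domain).
    lin_mu_s = np.exp(mu_speech + 0.5 * var_speech)
    lin_var_s = lin_mu_s ** 2 * (np.exp(var_speech) - 1.0)
    lin_mu_n = np.exp(mu_noise + 0.5 * var_noise)
    lin_var_n = lin_mu_n ** 2 * (np.exp(var_noise) - 1.0)

    # Step 2: speech and noise are assumed additive and independent in the
    # linear spectral domain, so means and variances add.
    lin_mu = gain * lin_mu_s + lin_mu_n
    lin_var = gain ** 2 * lin_var_s + lin_var_n

    # Step 3: approximate the sum as log-normal and map back to the log domain.
    var_noisy = np.log(lin_var / lin_mu ** 2 + 1.0)
    mu_noisy = np.log(lin_mu) - 0.5 * var_noisy
    return mu_noisy, var_noisy
```

When the noise is negligible relative to the speech, the combined statistics reduce to the clean-speech statistics, which is a quick sanity check on the three steps.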

Appears in Collection: EE-Journal Papers(저널논문)

KOASAS

Knowledge Service Development Team, KAIST 291 Daehak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea. T. 82-42-350-4493 Email. koasas@kaist.ac.kr
Copyright © 2016. Korea Advanced Institute of Science and Technology. All Rights Reserved.
