Seamless equal accuracy ratio for inclusive CTC speech recognition

Cited 5 times in Web of Science · Cited 0 times in Scopus
  • Hit : 252
  • Download : 0
DC Field: Value
dc.contributor.author: Gao, Heting
dc.contributor.author: Wang, Xiaoxuan
dc.contributor.author: Kang, Sunghun
dc.contributor.author: Mina, Rusty
dc.contributor.author: Issa, Dias
dc.contributor.author: Harvill, John
dc.contributor.author: Sari, Leda
dc.contributor.author: Hasegawa-Johnson, Mark
dc.contributor.author: Yoo, Chang-Dong
dc.date.accessioned: 2021-12-25T06:40:11Z
dc.date.available: 2021-12-25T06:40:11Z
dc.date.created: 2021-12-07
dc.date.issued: 2022-01
dc.identifier.citation: SPEECH COMMUNICATION, v.136, pp.76-83
dc.identifier.issn: 0167-6393
dc.identifier.uri: http://hdl.handle.net/10203/291261
dc.description.abstract: Concerns have been raised regarding performance disparity in automatic speech recognition (ASR) systems, as they provide unequal transcription accuracy for different user groups defined by attributes that include gender, dialect, and race. In this paper, we propose the “equal accuracy ratio”, a novel inclusiveness measure for ASR systems that can be seamlessly integrated into the standard connectionist temporal classification (CTC) training pipeline of an end-to-end neural speech recognizer to increase the recognizer’s inclusiveness. We also create a novel multi-dialect benchmark dataset to study the inclusiveness of ASR, by combining data from existing corpora in seven dialects of English (African American, General American, Latino English, British English, Indian English, Afrikaner English, and Xhosa English). Experiments on this multi-dialect corpus show that using the equal accuracy ratio as a regularization term along with the CTC loss succeeds in lowering the accuracy gap between user groups and in reducing the recognition error rate compared with a non-regularized baseline. Experiments on additional speech corpora with different user groups confirm these findings.
dc.language: English
dc.publisher: ELSEVIER
dc.title: Seamless equal accuracy ratio for inclusive CTC speech recognition
dc.type: Article
dc.identifier.wosid: 000789483100004
dc.identifier.scopusid: 2-s2.0-85121444596
dc.type.rims: ART
dc.citation.volume: 136
dc.citation.beginningpage: 76
dc.citation.endingpage: 83
dc.citation.publicationname: SPEECH COMMUNICATION
dc.identifier.doi: 10.1016/j.specom.2021.11.004
dc.contributor.localauthor: Yoo, Chang-Dong
dc.contributor.nonIdAuthor: Gao, Heting
dc.contributor.nonIdAuthor: Wang, Xiaoxuan
dc.contributor.nonIdAuthor: Mina, Rusty
dc.contributor.nonIdAuthor: Harvill, John
dc.contributor.nonIdAuthor: Sari, Leda
dc.contributor.nonIdAuthor: Hasegawa-Johnson, Mark
dc.description.isOpenAccess: N
dc.type.journalArticle: Article
dc.subject.keywordAuthor: Speech recognition
dc.subject.keywordAuthor: Fairness
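The abstract above describes using the equal accuracy ratio as a regularization term added to the CTC loss, but it does not give the exact formula. The following is a minimal, hypothetical sketch of that idea, assuming the ratio is the worst group's accuracy divided by the best group's accuracy, the penalty is one minus that ratio, and `lam` is an assumed regularization weight; all names and the penalty form here are illustrative, not the paper's actual definitions.

```python
def equal_accuracy_ratio(group_accuracies):
    """Ratio of the worst to the best per-group accuracy (1.0 = perfectly equal)."""
    worst = min(group_accuracies)
    best = max(group_accuracies)
    return worst / best if best > 0 else 0.0

def regularized_loss(ctc_loss, group_accuracies, lam=0.1):
    """Total training loss: CTC loss plus a penalty for unequal group accuracy."""
    penalty = 1.0 - equal_accuracy_ratio(group_accuracies)
    return ctc_loss + lam * penalty

# Example: three dialect groups with unequal recognition accuracy.
accs = [0.90, 0.75, 0.60]
print(round(equal_accuracy_ratio(accs), 4))            # 0.6667
print(round(regularized_loss(2.5, accs, lam=0.1), 4))  # 2.5333
```

As the accuracies across groups converge, the ratio approaches 1, the penalty vanishes, and the total loss reduces to the plain CTC loss, which is how such a term would steer training toward equal accuracy without replacing the recognition objective.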
Appears in Collection
EE-Journal Papers (Journal Papers)
Files in This Item
There are no files associated with this item.
