DSpace at KOASAS: Semantically-driven cut-and-paste data augmentation strategy for automatic speech recognition

Semantically-driven cut-and-paste data augmentation strategy for automatic speech recognition (Korean title: 자동 음성 인식을 위한 의미 중심 컷앤페이스트 데이터 증강 전략)

Seo, Kyusung / 서규성

A data augmentation technique involving cut-and-paste operations has garnered significant interest within the field of computer vision because of its straightforward nature and its proven effectiveness in enhancing the ability to generalize. However, applying this method to Automatic Speech Recognition (ASR) tasks poses challenges due to the varying lengths of segments corresponding to specific output tokens such as words or sub-words. Furthermore, if speech segments are combined without regard for their meaning, there is a risk of generating incoherent or nonsensical sentences. In this paper, we introduce a method called WeavSpeech, which addresses these challenges by offering a straightforward yet powerful cut-and-paste augmentation approach for ASR tasks. WeavSpeech weaves together pairs of speech data while taking into account their semantics. This method is universally applicable to languages without requiring language-specific knowledge and can be seamlessly incorporated with other verified augmentation techniques such as SpecAugment. Our research demonstrates the superiority of WeavSpeech on well-known ASR benchmark datasets, including LibriSpeech and WSJ.
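
The abstract does not describe the WeavSpeech algorithm itself, so the following is only a minimal sketch of a generic cut-and-paste operation on word-aligned speech, not the author's method. It illustrates the variable-length problem the abstract mentions: the pasted segment may be longer or shorter than the segment it replaces, so later word boundaries must be shifted. The names `Utterance` and `cut_and_paste` are hypothetical, the word-level alignments are assumed to come from some external aligner, and the semantic selection of which segments to pair is deliberately left out.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Utterance:
    samples: List[float]               # raw waveform samples
    words: List[Tuple[str, int, int]]  # (word, start_sample, end_sample), end exclusive


def cut_and_paste(a: Utterance, b: Utterance,
                  a_span: Tuple[int, int], b_span: Tuple[int, int]) -> Utterance:
    """Replace the words a_span of `a` with the words b_span of `b`.

    Spans are half-open ranges of word indices. This is a generic illustration;
    it does not check whether the resulting sentence is meaningful.
    """
    a_lo, a_hi = a_span
    b_lo, b_hi = b_span
    if not (0 <= a_lo < a_hi <= len(a.words)):
        raise ValueError("a_span is out of range")
    if not (0 <= b_lo < b_hi <= len(b.words)):
        raise ValueError("b_span is out of range")

    # Sample boundaries of the region removed from `a` and the region taken from `b`.
    cut_start, cut_end = a.words[a_lo][1], a.words[a_hi - 1][2]
    src_start, src_end = b.words[b_lo][1], b.words[b_hi - 1][2]

    pasted = b.samples[src_start:src_end]
    samples = a.samples[:cut_start] + pasted + a.samples[cut_end:]

    # Pasted words move from b's timeline onto a's timeline.
    shift_in = cut_start - src_start
    # Words after the cut move by the length difference of the two segments.
    shift_tail = len(pasted) - (cut_end - cut_start)

    words = (
        a.words[:a_lo]
        + [(w, s + shift_in, e + shift_in) for w, s, e in b.words[b_lo:b_hi]]
        + [(w, s + shift_tail, e + shift_tail) for w, s, e in a.words[a_hi:]]
    )
    return Utterance(samples, words)
```

Because the transcript is rebuilt from the same word list as the audio, the label always matches the augmented waveform. A semantics-aware variant would constrain which `a_span` and `b_span` pairs are allowed, which is the part this sketch omits.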

Advisors: Yang, Eunho (양은호)

Description: Korea Advanced Institute of Science and Technology (KAIST): Kim Jaechul Graduate School of AI

Publisher: Korea Advanced Institute of Science and Technology (KAIST)

Issue Date: 2024

Identifier: 325007

Language: eng

Description: Thesis (Master's) - Korea Advanced Institute of Science and Technology (KAIST): Kim Jaechul Graduate School of AI, 2024.2, [iii, 17 p.]

Keywords: Speech recognition; Data augmentation; Cut-and-paste; CutMix; Mixup

URI: http://hdl.handle.net/10203/321356

Link: http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=1096061&flag=dissertation

Appears in Collection: AI-Theses_Master(석사논문)

Files in This Item: There are no files associated with this item.

KOASAS

Knowledge Service Development Team, KAIST 291 Daehak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea. T. 82-42-350-4493 Email. koasas@kaist.ac.kr
Copyright © 2016. Korea Advanced Institute of Science and Technology. All Rights Reserved.
