DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 양은호 | - |
dc.contributor.author | Kim, Changhun | - |
dc.contributor.author | 김창훈 | - |
dc.date.accessioned | 2024-07-30T19:30:41Z | - |
dc.date.available | 2024-07-30T19:30:41Z | - |
dc.date.issued | 2024 | - |
dc.identifier.uri | http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=1096077&flag=dissertation | en_US |
dc.identifier.uri | http://hdl.handle.net/10203/321372 | - |
dc.description | Thesis (Master's) - Korea Advanced Institute of Science and Technology (KAIST) : Kim Jaechul Graduate School of AI, 2024.2, [iv, 22 p.] | - |
dc.description.abstract | In real-world scenarios, automatic speech recognition (ASR) models often encounter data distribution shifts, leading to erroneous predictions. To tackle this issue, a recent test-time adaptation (TTA) method has been proposed to adapt a pre-trained ASR model to an unlabeled target domain without access to source data. Despite decent performance gains, this approach relies solely on naive greedy decoding and adapts across timesteps at the frame level, which may be suboptimal given the sequential nature of the model outputs. Motivated by this limitation, this thesis introduces Sequential-level Generalized Entropy Minimization (SGEM), a novel TTA framework for general ASR models. To handle sequential outputs, SGEM first exploits beam search to explore candidate output logits and selects the most plausible one. It then utilizes generalized entropy minimization and negative sampling as effective unsupervised objectives to adapt the model. Extensive experiments show that SGEM achieves state-of-the-art performance for three mainstream ASR models under various distribution shifts. | - |
dc.language | eng | - |
dc.publisher | Korea Advanced Institute of Science and Technology (KAIST) | - |
dc.subject | 기계 학습; 음성 인식; 분포 변화 강건성; 테스트타임 적응; 빔 서치; 엔트로피 최소화; 네거티브 샘플링 | - |
dc.subject | Machine learning; Automatic speech recognition; Distribution shift robustness; Test-time adaptation; Beam search; Entropy minimization; Negative sampling | - |
dc.title | Test-time adaptation for automatic speech recognition via sequential-level generalized entropy minimization | - |
dc.title.alternative | 문장 수준의 일반화된 엔트로피 최소화를 통한 음성 인식 모델에 대한 테스트타임 적응 | - |
dc.type | Thesis (Master) | - |
dc.identifier.CNRN | 325007 | - |
dc.description.department | Korea Advanced Institute of Science and Technology (KAIST) : Kim Jaechul Graduate School of AI | - |
dc.contributor.alternativeauthor | Yang, Eunho | - |
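The generalized entropy minimization objective mentioned in the abstract can be sketched as follows. This is a minimal illustration only, assuming a Rényi-style generalized entropy averaged over timesteps of the beam-selected logit sequence; the `alpha` value and all function names are hypothetical and not taken from the thesis:

```python
import math

def softmax(logits):
    # Numerically stable softmax over a single frame's logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def renyi_entropy(probs, alpha=0.33):
    # Generalized (Renyi) entropy of order alpha; as alpha -> 1 it
    # recovers Shannon entropy. Requires alpha > 0, alpha != 1.
    assert alpha > 0 and alpha != 1
    return (1.0 / (1.0 - alpha)) * math.log(sum(p ** alpha for p in probs))

def generalized_entropy_loss(logit_seq, alpha=0.33):
    # Sequential-level adaptation loss: average generalized entropy
    # over the timesteps of a (beam-selected) logit sequence.
    # Minimizing this sharpens the model's per-frame predictions.
    return sum(renyi_entropy(softmax(l), alpha) for l in logit_seq) / len(logit_seq)
```

A confident (peaked) logit sequence yields a lower loss than an uncertain one, which is what drives the unsupervised adaptation; in practice this loss would be backpropagated through the ASR model's logits rather than computed on plain lists.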
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.