DSpace at KOASAS: Improving speech emotion recognition by fusing self-supervised learning and spectral features via mixture of experts


Improving speech emotion recognition by fusing self-supervised learning and spectral features via mixture of experts


DC Field	Value	Language
dc.contributor.author	Hyeon, Jonghwan	ko
dc.contributor.author	Oh, Yung-Hwan	ko
dc.contributor.author	Lee, Young-Jun	ko
dc.contributor.author	Choi, Ho-Jin	ko
dc.date.accessioned	2024-07-01T09:00:09Z	-
dc.date.available	2024-07-01T09:00:09Z	-
dc.date.created	2024-06-25	-
dc.date.issued	2024-03	-
dc.identifier.citation	DATA & KNOWLEDGE ENGINEERING, v.150	-
dc.identifier.issn	0169-023X	-
dc.identifier.uri	http://hdl.handle.net/10203/320087	-
dc.description.abstract	Speech Emotion Recognition (SER) is an important area of research in speech processing that aims to identify and classify emotional states conveyed through speech signals. Recent studies have shown considerable performance in SER by exploiting deep contextualized speech representations from self-supervised learning (SSL) models. However, SSL models pre-trained on clean speech data may not perform well on emotional speech data due to the domain shift problem. To address this problem, this paper proposes a novel approach that simultaneously exploits an SSL model and a domain-agnostic spectral feature (SF) through the Mixture of Experts (MoE) technique. The proposed approach achieves the state-of-the-art performance on weighted accuracy compared to other methods in the IEMOCAP dataset. Moreover, this paper demonstrates the existence of the domain shift problem of SSL models in the SER task.	-
dc.language	English	-
dc.publisher	ELSEVIER	-
dc.title	Improving speech emotion recognition by fusing self-supervised learning and spectral features via mixture of experts	-
dc.type	Article	-
dc.identifier.wosid	001146036900001	-
dc.identifier.scopusid	2-s2.0-85185881711	-
dc.type.rims	ART	-
dc.citation.volume	150	-
dc.citation.publicationname	DATA & KNOWLEDGE ENGINEERING	-
dc.identifier.doi	10.1016/j.datak.2023.102262	-
dc.contributor.localauthor	Oh, Yung-Hwan	-
dc.contributor.localauthor	Choi, Ho-Jin	-
dc.description.isOpenAccess	N	-
dc.type.journalArticle	Article	-
dc.subject.keywordAuthor	Speech emotion recognition	-
dc.subject.keywordAuthor	Self-supervised learning	-
dc.subject.keywordAuthor	Domain shift	-
dc.subject.keywordAuthor	Spectral feature	-
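
The abstract describes fusing an SSL representation with a spectral feature through a Mixture of Experts. The record gives no architectural details, so the following is only a minimal sketch of generic MoE fusion, not the authors' model. It assumes one linear expert per feature stream, a softmax gate over the concatenated inputs, random untrained weights, and four emotion classes. All names (`moe_fuse`, `W_ssl`, `W_spec`, `W_gate`) and dimensions are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def moe_fuse(ssl_feat, spec_feat, params):
    """Blend two expert classifiers using per-utterance gate weights.

    ssl_feat:  (batch, d_ssl)  utterance-level SSL embedding (assumed shape)
    spec_feat: (batch, d_spec) utterance-level spectral feature (assumed shape)
    """
    # Expert 1: class probabilities predicted from the SSL embedding.
    p_ssl = softmax(ssl_feat @ params["W_ssl"])
    # Expert 2: class probabilities predicted from the spectral feature.
    p_spec = softmax(spec_feat @ params["W_spec"])
    # The gate sees both inputs and emits one weight per expert.
    gate_in = np.concatenate([ssl_feat, spec_feat], axis=-1)
    gate = softmax(gate_in @ params["W_gate"])  # shape (batch, 2)
    # Convex combination of the two experts' predictions.
    fused = gate[:, :1] * p_ssl + gate[:, 1:] * p_spec
    return fused, gate


# Hypothetical sizes; random weights stand in for trained parameters.
rng = np.random.default_rng(0)
batch, d_ssl, d_spec, n_classes = 4, 16, 8, 4
params = {
    "W_ssl": rng.normal(size=(d_ssl, n_classes)),
    "W_spec": rng.normal(size=(d_spec, n_classes)),
    "W_gate": rng.normal(size=(d_ssl + d_spec, 2)),
}
ssl_feat = rng.normal(size=(batch, d_ssl))
spec_feat = rng.normal(size=(batch, d_spec))
fused, gate = moe_fuse(ssl_feat, spec_feat, params)
```

Because the gate weights sum to one for each utterance, the fused output stays a valid probability distribution over classes. In a design like this, the gate can lean on the spectral expert when the SSL expert is less reliable for a given input, which is one way such a fusion could mitigate domain shift.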

Appears in Collection: CS-Journal Papers (Journal Papers)

Files in This Item: There are no files associated with this item.


KOASAS

Knowledge Service Development Team, KAIST 291 Daehak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea. T. 82-42-350-4493 Email. koasas@kaist.ac.kr
Copyright © 2016. Korea Advanced Institute of Science and Technology. All Rights Reserved.
