Recycle-and-Distill: Universal Compression Strategy for Transformer-based Speech SSL Models with Attention Map Reusing and Masking Distillation

DC Field | Value | Language
dc.contributor.author | Jang, Kangwook | ko
dc.contributor.author | Kim, Sungnyun | ko
dc.contributor.author | Yun, Seyoung | ko
dc.contributor.author | Kim, Hoi-Rin | ko
dc.date.accessioned | 2023-11-22T07:02:17Z | -
dc.date.available | 2023-11-22T07:02:17Z | -
dc.date.created | 2023-11-22 | -
dc.date.issued | 2023-08-21 | -
dc.identifier.citation | 24th International Speech Communication Association, Interspeech 2023, pp. 316-320 | -
dc.identifier.uri | http://hdl.handle.net/10203/315050 | -
dc.description.abstract | Transformer-based speech self-supervised learning (SSL) models, such as HuBERT, show surprising performance in various speech processing tasks. However, the huge number of parameters in speech SSL models necessitates compression into a more compact model for wider use in academia or small companies. In this study, we propose reusing attention maps across Transformer layers, which removes key and query parameters while retaining the number of layers. Furthermore, we propose a novel masking distillation strategy to improve the student model's speech representation quality. We extend the distillation loss to utilize both masked and unmasked speech frames, fully leveraging the teacher model's high-quality representation. Our universal compression strategy yields a student model that achieves a phoneme error rate (PER) of 7.72% and a word error rate (WER) of 9.96% on the SUPERB benchmark. (An illustrative sketch of these two ideas follows this record.) | -
dc.language | English | -
dc.publisher | International Speech Communication Association | -
dc.title | Recycle-and-Distill: Universal Compression Strategy for Transformer-based Speech SSL Models with Attention Map Reusing and Masking Distillation | -
dc.type | Conference | -
dc.identifier.scopusid | 2-s2.0-85171584650 | -
dc.type.rims | CONF | -
dc.citation.beginningpage | 316 | -
dc.citation.endingpage | 320 | -
dc.citation.publicationname | 24th International Speech Communication Association, Interspeech 2023 | -
dc.identifier.conferencecountry | IE | -
dc.identifier.conferencelocation | Dublin | -
dc.contributor.localauthor | Yun, Seyoung | -
dc.contributor.localauthor | Kim, Hoi-Rin | -
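
The abstract above describes two techniques: reusing attention maps across Transformer layers so that reusing layers can drop their key and query projections, and a distillation loss applied to both masked and unmasked speech frames. Below is a minimal PyTorch sketch of how such a setup could be wired. The module and function names (ReuseAttentionLayer, masking_distill_loss), the L1 distance, and the equal weighting of the two loss terms are illustrative assumptions, not the paper's released implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ReuseAttentionLayer(nn.Module):
    """Transformer layer that either computes its own attention map or
    reuses one produced by an earlier layer (illustrative sketch)."""

    def __init__(self, dim: int, n_heads: int, compute_attn: bool = True):
        super().__init__()
        self.n_heads = n_heads
        self.head_dim = dim // n_heads
        self.compute_attn = compute_attn
        if compute_attn:
            # Key/query projections exist only in layers that compute attention;
            # reusing layers omit them entirely, which is where parameters are saved.
            self.q_proj = nn.Linear(dim, dim)
            self.k_proj = nn.Linear(dim, dim)
        self.v_proj = nn.Linear(dim, dim)
        self.out_proj = nn.Linear(dim, dim)
        self.ffn = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)

    def forward(self, x, attn=None):
        B, T, D = x.shape
        h = self.norm1(x)
        if self.compute_attn:
            q = self.q_proj(h).view(B, T, self.n_heads, self.head_dim).transpose(1, 2)
            k = self.k_proj(h).view(B, T, self.n_heads, self.head_dim).transpose(1, 2)
            attn = torch.softmax(q @ k.transpose(-2, -1) / self.head_dim ** 0.5, dim=-1)
        else:
            assert attn is not None, "a reusing layer needs a recycled attention map"
        # The value/output path always runs; only the attention map is recycled.
        v = self.v_proj(h).view(B, T, self.n_heads, self.head_dim).transpose(1, 2)
        x = x + self.out_proj((attn @ v).transpose(1, 2).reshape(B, T, D))
        x = x + self.ffn(self.norm2(x))
        return x, attn


def masking_distill_loss(student_repr, teacher_repr, mask):
    """Distillation over both masked and unmasked frames.

    student_repr, teacher_repr: (B, T, D) frame representations.
    mask: (B, T) boolean tensor marking masked frames.
    The L1 distance and the 1:1 weighting are assumptions for illustration.
    """
    masked_loss = F.l1_loss(student_repr[mask], teacher_repr[mask])
    unmasked_loss = F.l1_loss(student_repr[~mask], teacher_repr[~mask])
    return masked_loss + unmasked_loss
```

In a full student encoder, layers could be grouped so that each layer that computes an attention map is followed by one or more layers that reuse it, keeping the layer count unchanged while shedding the key/query weights of the reusing layers.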
Appears in Collections
  • AI-Conference Papers (학술대회논문)
  • EE-Conference Papers (학술회의논문)
Files in This Item
There are no files associated with this item.
