DC Field | Value | Language |
---|---|---|
dc.contributor.author | Yoon, Dongkeun | ko |
dc.contributor.author | Jang, Joel | ko |
dc.contributor.author | Kim, Sungdong | ko |
dc.contributor.author | Seo, Minjoon | ko |
dc.date.accessioned | 2023-12-12T09:00:59Z | - |
dc.date.available | 2023-12-12T09:00:59Z | - |
dc.date.created | 2023-12-09 | -
dc.date.issued | 2023-07 | - |
dc.identifier.citation | ACL 2023, pp. 851-864 | -
dc.identifier.uri | http://hdl.handle.net/10203/316299 | - |
dc.description.abstract | In this work, we empirically show that updating pretrained LMs (350M, 1.3B, 2.7B) with just a few steps of Gradient Ascent Post-training (GAP) on random, unlabeled text corpora enhances their zero-shot generalization capabilities across diverse NLP tasks. Specifically, we show that GAP can allow LMs to become comparable to LMs 2-3x larger across 12 different NLP tasks. We also show that applying GAP on out-of-distribution corpora leads to the most reliable performance improvements. Our findings indicate that GAP can be a promising method for improving the generalization capability of LMs without any task-specific fine-tuning. (See the sketch after this record.) | -
dc.language | English | - |
dc.publisher | Association for Computational Linguistics (ACL) | - |
dc.title | Gradient Ascent Post-training Enhances Language Model Generalization | - |
dc.type | Conference | - |
dc.identifier.scopusid | 2-s2.0-85172257505 | - |
dc.type.rims | CONF | - |
dc.citation.beginningpage | 851 | - |
dc.citation.endingpage | 864 | - |
dc.citation.publicationname | ACL 2023 | - |
dc.identifier.conferencecountry | CA | -
dc.identifier.conferencelocation | Toronto | - |
dc.contributor.localauthor | Seo, Minjoon | - |
dc.contributor.nonIdAuthor | Yoon, Dongkeun | - |
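A minimal sketch of the GAP procedure described in the abstract above: a few gradient-ascent steps on the language-modeling loss over random, unlabeled text. This is an illustrative assumption of the setup, not the authors' released implementation; the model name, learning rate, step count, and placeholder corpus are all hypothetical choices.

```python
# Hypothetical sketch of Gradient Ascent Post-training (GAP):
# take a few gradient-ASCENT steps on the causal LM loss over
# random, unlabeled text. All hyperparameters below are assumed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "facebook/opt-350m"  # assumption: one of the 350M-2.7B sizes
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.train()

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)  # assumed lr

# Placeholder for samples drawn from a random, unlabeled corpus.
texts = ["The quick brown fox jumps over the lazy dog."]

for step, text in enumerate(texts[:10]):  # "just a few steps"
    batch = tok(text, return_tensors="pt", truncation=True, max_length=512)
    out = model(**batch, labels=batch["input_ids"])
    # Negate the LM loss so the optimizer's descent step performs
    # gradient ascent on the original language-modeling objective.
    loss = -out.loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

After these few ascent steps, the updated model would be evaluated zero-shot on downstream NLP tasks, with no task-specific fine-tuning.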