DC Field | Value | Language |
---|---|---|
dc.contributor.author | Yoon, Dongkeun | ko |
dc.contributor.author | Jang, Joel | ko |
dc.contributor.author | Kim, Sungdong | ko |
dc.contributor.author | Seo, Minjoon | ko |
dc.date.accessioned | 2023-12-12T09:00:59Z | - |
dc.date.available | 2023-12-12T09:00:59Z | - |
dc.date.created | 2023-12-09 | -
dc.date.issued | 2023-07 | - |
dc.identifier.citation | ACL 2023, pp. 851-864 | -
dc.identifier.uri | http://hdl.handle.net/10203/316299 | - |
dc.description.abstract | In this work, we empirically show that updating pretrained LMs (350M, 1.3B, 2.7B) with just a few steps of Gradient Ascent Post-training (GAP) on random, unlabeled text corpora enhances their zero-shot generalization capabilities across diverse NLP tasks. Specifically, we show that GAP can allow LMs to become comparable to LMs 2-3x larger across 12 different NLP tasks. We also show that applying GAP on out-of-distribution corpora leads to the most reliable performance improvements. Our findings indicate that GAP can be a promising method for improving the generalization capability of LMs without any task-specific fine-tuning. (See the sketch after this record.) | -
dc.language | English | - |
dc.publisher | Association for Computational Linguistics (ACL) | - |
dc.title | Gradient Ascent Post-training Enhances Language Model Generalization | - |
dc.type | Conference | - |
dc.identifier.scopusid | 2-s2.0-85172257505 | - |
dc.type.rims | CONF | - |
dc.citation.beginningpage | 851 | - |
dc.citation.endingpage | 864 | - |
dc.citation.publicationname | ACL 2023 | - |
dc.identifier.conferencecountry | CA | -
dc.identifier.conferencelocation | Toronto | - |
dc.contributor.localauthor | Seo, Minjoon | - |
dc.contributor.nonIdAuthor | Yoon, Dongkeun | - |
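A minimal sketch of the GAP procedure described in the abstract above: a few gradient-ascent steps on the language-modeling loss over random, unlabeled text. This is an illustrative assumption of the setup, not the authors' released implementation; the model name, learning rate, step count, and placeholder corpus are all hypothetical choices.

```python
# Hypothetical sketch of Gradient Ascent Post-training (GAP):
# take a few gradient-ASCENT steps on the causal LM loss over
# random, unlabeled text. All hyperparameters below are assumed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "facebook/opt-350m"  # assumption: one of the 350M-2.7B sizes
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.train()

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)  # assumed lr

# Placeholder for samples drawn from a random, unlabeled corpus.
texts = ["The quick brown fox jumps over the lazy dog."]

for step, text in enumerate(texts[:10]):  # "just a few steps"
    batch = tok(text, return_tensors="pt", truncation=True, max_length=512)
    out = model(**batch, labels=batch["input_ids"])
    # Negate the LM loss so the optimizer's descent step performs
    # gradient ascent on the original language-modeling objective.
    loss = -out.loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

After these few ascent steps, the updated model would be evaluated zero-shot on downstream NLP tasks, with no task-specific fine-tuning.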