Gradient Ascent Post-training Enhances Language Model Generalization

In this work, we empirically show that updating pretrained LMs (350M, 1.3B, 2.7B) with just a few steps of Gradient Ascent Post-training (GAP) on random, unlabeled text corpora enhances their zero-shot generalization capabilities across diverse NLP tasks. Specifically, we show that GAP can make LMs comparable to LMs 2-3x their size across 12 different NLP tasks. We also show that applying GAP to out-of-distribution corpora leads to the most reliable performance improvements. Our findings indicate that GAP is a promising method for improving the generalization capability of LMs without any task-specific fine-tuning.
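
A minimal sketch of what a gradient ascent post-training step could look like in PyTorch with Hugging Face Transformers, assuming ascent is performed on the standard causal language modeling loss; the model name, optimizer, learning rate, and placeholder text are illustrative assumptions, not the paper's reported configuration.

    # Illustrative sketch of Gradient Ascent Post-training (GAP).
    # Assumptions: Hugging Face causal LM, SGD optimizer, placeholder
    # hyperparameters and data (not the paper's setup).
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_name = "facebook/opt-350m"  # placeholder 350M-scale LM
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)
    model.train()

    optimizer = torch.optim.SGD(model.parameters(), lr=1e-5)  # assumed hyperparameters

    unlabeled_texts = ["placeholder unlabeled text snippet"]  # random, unlabeled corpus samples

    for text in unlabeled_texts:  # only a few steps of post-training
        batch = tokenizer(text, return_tensors="pt", truncation=True)
        outputs = model(**batch, labels=batch["input_ids"])
        loss = -outputs.loss  # negate the LM loss so the update ascends it
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

After these few updates, the model would be evaluated zero-shot on downstream tasks, with no task-specific fine-tuning.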
Publisher
Association for Computational Linguistics (ACL)
Issue Date
2023-07
Language
English
Citation
ACL 2023, pp. 851-864
URI
http://hdl.handle.net/10203/316299
Appears in Collection
AI-Conference Papers(학술대회논문)