DSpace at KOASAS: Generation of Korean Offensive Language by Leveraging Large Language Models via Prompt Design

DSpace at KOASAS

College of Engineering(공과대학)School of Computing(전산학부)CS-Conference Papers(학술회의논문)

Generation of Korean Offensive Language by Leveraging Large Language Models via Prompt Design

Cited 0 time in webofscience

Cited 0 time in scopus

Hit : 59
Download : 0

Export

Shin, Jisu / Song, Hoyun / Lee, Huije / Gaim, Fitsum / Park, Jong-Cheol researcher

The research for detecting offensive language on online platforms has much advanced. However, the majority of these studies have primarily focused on English. Given the unique characteristics of offensive language, where social and cultural contexts significantly influence content understanding, language-specific datasets are essential. Acquiring comprehensive datasets in Korean, a less-resourced language, has mostly relied on human annotations, suffering from inherent limitations in terms of labor intensity and potential annotator bias. Automatic generation of datasets using generative methods offers an alternative approach to address these limitations, yet faces challenges in capturing linguistic and cultural diversities while maintaining native-level fluency. To address these challenges, we introduce a prompt design methodology, Korean Offensive language Machine Generation (K-OMG), using large language models. By manipulating three prompt factors, we find an effective prompt design to generate culturally aligned offensive language with fluent expressions. Experimental results demonstrate the high quality and utility of our automatically generated dataset. Our detailed analysis shows that the proposed approach achieves exceptional fluency in generating texts while effectively incorporating social and cultural diversities.

Publisher: Association for Computational Linguistics (ACL)

Issue Date: 2023-11-02

Language: English

Citation: The 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics

URI: http://hdl.handle.net/10203/314595

Appears in Collection: CS-Conference Papers(학술회의논문)

Files in This Item: There are no files associated with this item.

Display Full Item Record

qr_code

트윗하기

KOASAS

Knowledge Service Development Team, KAIST 291 Daehak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea. T. 82-42-350-4493 Email. koasas@kaist.ac.kr
Copyright © 2016. Korea Advanced Institute of Science and Technology. All Rights Reserved.

KOASAS

KOASAS

Browse

Generation of Korean Offensive Language by Leveraging Large Language Models via Prompt Design

KOASAS

Communities & Collections