Does it Really Generalize Well on Unseen Data? Systematic Evaluation of Relational Triple Extraction Methods

Cited 1 time in Web of Science; cited 0 times in Scopus.
DC Field | Value | Language
dc.contributor.author | Lee, Juhyuk | ko
dc.contributor.author | Lee, Min-Joong | ko
dc.contributor.author | Yang, June Yong | ko
dc.contributor.author | Yang, Eunho | ko
dc.date.accessioned | 2022-12-06T01:00:45Z | -
dc.date.available | 2022-12-06T01:00:45Z | -
dc.date.created | 2022-12-04 | -
dc.date.issued | 2022-07-12 | -
dc.identifier.citation | 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL 2022, pp.3849 - 3858 | -
dc.identifier.uri | http://hdl.handle.net/10203/301719 | -
dc.description.abstract | The ability to extract entities and their relations from unstructured text is essential for the automated maintenance of large-scale knowledge graphs. To keep a knowledge graph up-to-date, an extractor needs not only the ability to recall the triples it encountered during training, but also the ability to extract new triples from contexts it has never seen before. In this paper, we show that although existing extraction models can easily memorize and recall already seen triples, they do not generalize effectively to unseen triples. This alarming observation was previously unknown due to the composition of the test sets of the go-to benchmark datasets, which turn out to contain only 2% unseen data, rendering them incapable of measuring generalization performance. To measure generalization performance separately from memorization performance, we emphasize unseen data by rearranging datasets, sifting out training instances, or augmenting test sets. In addition, we present a simple yet effective augmentation technique to promote the generalization of existing extraction models, and experimentally confirm that the proposed method can significantly increase their generalization performance. | -
dc.language | English | -
dc.publisher | Association for Computational Linguistics | -
dc.title | Does it Really Generalize Well on Unseen Data? Systematic Evaluation of Relational Triple Extraction Methods | -
dc.type | Conference | -
dc.identifier.wosid | 000859869503073 | -
dc.identifier.scopusid | 2-s2.0-85138385914 | -
dc.type.rims | CONF | -
dc.citation.beginningpage | 3849 | -
dc.citation.endingpage | 3858 | -
dc.citation.publicationname | 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL 2022 | -
dc.identifier.conferencecountry | US | -
dc.identifier.conferencelocation | Seattle | -
dc.identifier.doi | 10.18653/v1/2022.naacl-main.282 | -
dc.contributor.localauthor | Yang, Eunho | -
dc.contributor.nonIdAuthor | Lee, Juhyuk | -
dc.contributor.nonIdAuthor | Lee, Min-Joong | -
Appears in Collection
AI-Conference Papers (학술대회논문 / Conference Papers)
Files in This Item
There are no files associated with this item.