VISTA: Visual-Textual Knowledge Graph Representation Learning

Abstract
Knowledge graphs represent human knowledge using triplets composed of entities and relations. While most existing knowledge graph embedding methods only consider the structure of a knowledge graph, a few recently proposed multimodal methods utilize images or text descriptions of entities in a knowledge graph. In this paper, we propose visual-textual knowledge graphs (VTKGs), where not only entities but also triplets can be explained using images, and both entities and relations can accompany text descriptions. By compiling visually expressible commonsense knowledge, we construct new benchmark datasets where triplets themselves are explained by images, and the meanings of entities and relations are described using text. We propose VISTA, a knowledge graph representation learning method for VTKGs, which incorporates the visual and textual representations of entities and relations using entity encoding, relation encoding, and triplet decoding transformers. Experiments show that VISTA outperforms state-of-the-art knowledge graph completion methods in real-world VTKGs.
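
To make the VTKG data model described in the abstract concrete, the Python sketch below shows one plausible way to represent a triplet whose entities, relation, and triplet-level annotation each carry optional images and text. The class names and fields are illustrative assumptions for this page only, not the released dataset schema or the authors' code.

from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Entity:
    """An entity node, optionally annotated with a text description and images."""
    name: str
    description: Optional[str] = None                       # textual description of the entity
    image_paths: List[str] = field(default_factory=list)    # images depicting the entity

@dataclass
class Relation:
    """A relation type, optionally annotated with a text description."""
    name: str
    description: Optional[str] = None

@dataclass
class VTKGTriplet:
    """A (head, relation, tail) triplet; in a VTKG the triplet itself
    may also be illustrated by images showing the head and tail interacting."""
    head: Entity
    relation: Relation
    tail: Entity
    image_paths: List[str] = field(default_factory=list)    # images depicting the whole triplet

# Example of visually expressible commonsense knowledge (hypothetical instance).
person = Entity("person", description="a human being")
umbrella = Entity("umbrella", description="a canopy used for protection from rain")
holds = Relation("holds", description="the head entity grips the tail entity")

triplet = VTKGTriplet(person, holds, umbrella,
                      image_paths=["images/person_holds_umbrella.jpg"])
print(triplet.head.name, triplet.relation.name, triplet.tail.name)

In VISTA, annotations of this kind would be consumed by the entity encoding, relation encoding, and triplet decoding transformers; the sketch only illustrates the input structure, not the model itself.
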
Publisher
Association for Computational Linguistics
Issue Date
2023-12-09
Language
English
Citation
The 2023 Conference on Empirical Methods in Natural Language Processing, pp. 7314-7328
DOI
10.18653/v1/2023.findings-emnlp.488
URI
http://hdl.handle.net/10203/316861
Appears in Collection
CS-Conference Papers (Conference Papers)
Files in This Item
There are no files associated with this item.