Can CLIP Share Image in Dialogue?

Recently, many studies have constructed multimodal dialogue datasets containing image-sharing behavior, which is vital for strengthening social bonds with interlocutors in open-domain conversation. In this paper, we report empirical results showing that CLIP can capture the alignment between a dialogue history and an image, based on experiments on (1) zero-shot transferability, (2) the effect of dialogue history, and (3) robustness. Our experiments indicate that the zero-shot performance of CLIP on the multimodal dialogue dataset needs to be improved. Additionally, the CLIP model benefits from more informative text (i.e., the dialogue history) rather than the last utterance only.
Publisher
Institute of Electrical and Electronics Engineers Inc.
Issue Date
2023-02-13
Language
English
Citation

2023 IEEE International Conference on Big Data and Smart Computing, BigComp 2023, pp.410 - 412

DOI
10.1109/BigComp57234.2023.00101
URI
http://hdl.handle.net/10203/308692
Appears in Collection
CS-Conference Papers