Dealing with Missing Modalities in the Visual Question Answer-Difference Prediction Task through Knowledge Distillation

| DC Field | Value | Language |
| --- | --- | --- |
| dc.contributor.author | Cho, Jae Won | ko |
| dc.contributor.author | Kim, Dong-Jin | ko |
| dc.contributor.author | Choi, Jinsoo | ko |
| dc.contributor.author | Jung, Yunjae | ko |
| dc.contributor.author | Kweon, In-So | ko |
| dc.date.accessioned | 2023-09-05T12:01:14Z | - |
| dc.date.available | 2023-09-05T12:01:14Z | - |
| dc.date.created | 2023-09-05 | - |
| dc.date.issued | 2021-06 | - |
| dc.identifier.citation | 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) | - |
| dc.identifier.issn | 2160-7508 | - |
| dc.identifier.uri | http://hdl.handle.net/10203/312241 | - |
| dc.description.abstract | In this work, we address the missing modalities that arise in the Visual Question Answer-Difference prediction task and propose a novel method for solving the task. Specifically, we address the missing modality (the ground-truth answers), which is not present at test time, using a privileged knowledge distillation scheme. To do so efficiently, we first introduce a model, the "Big" Teacher, that takes the image/question/answer triplet as its input and outperforms the baseline. We then use a combination of models to distill knowledge to a target network (the student) that takes only the image/question pair as its input. We evaluate our models on the VizWiz and VQA-V2 Answer Difference datasets and show, through extensive experimentation and ablation, the performance of our method and diverse possibilities for future research. | - |
| dc.language | English | - |
| dc.publisher | IEEE | - |
| dc.title | Dealing with Missing Modalities in the Visual Question Answer-Difference Prediction Task through Knowledge Distillation | - |
| dc.type | Conference | - |
| dc.identifier.wosid | 000705890201074 | - |
| dc.identifier.scopusid | 2-s2.0-85112519740 | - |
| dc.type.rims | CONF | - |
| dc.citation.publicationname | 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) | - |
| dc.identifier.conferencecountry | US | - |
| dc.identifier.conferencelocation | Nashville, TN | - |
| dc.identifier.doi | 10.1109/cvprw53098.2021.00175 | - |
| dc.contributor.localauthor | Kweon, In-So | - |
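
The abstract describes a teacher that sees the image/question/answer triplet and a student that sees only the image/question pair. A minimal sketch of a generic distillation objective for that setup follows. It is not the authors' implementation: the record does not state the loss, so the multi-label framing, the binary cross-entropy terms, the function names, and the `alpha` weighting are all assumptions made for illustration.

```python
import math


def sigmoid(x: float) -> float:
    """Map a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def bce(prob: float, target: float, eps: float = 1e-7) -> float:
    """Binary cross-entropy for one label; target may be hard (0/1) or soft."""
    p = min(max(prob, eps), 1.0 - eps)  # clamp to keep log() finite
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


def distillation_loss(student_logits, teacher_logits, labels, alpha=0.5):
    """Hypothetical student objective (assumed form, not from the paper).

    student_logits: outputs of a model that sees only image + question.
    teacher_logits: outputs of a model that also saw the ground-truth answers.
    labels:         ground-truth 0/1 labels, one per class.
    alpha:          assumed weight on the teacher-matching term.
    """
    n = len(labels)
    # Supervised term: student predictions against the ground-truth labels.
    hard = sum(bce(sigmoid(s), y) for s, y in zip(student_logits, labels)) / n
    # Distillation term: student predictions against the teacher's soft outputs.
    soft = sum(
        bce(sigmoid(s), sigmoid(t)) for s, t in zip(student_logits, teacher_logits)
    ) / n
    return (1.0 - alpha) * hard + alpha * soft


# Usage: the teacher is only needed while training; at test time the student
# runs alone, so the missing answers are never required.
loss = distillation_loss([0.2, -1.0], [2.5, -3.0], [1, 0], alpha=0.5)
```

The teacher's logits enter only through the loss, so nothing about the student's inputs has to change between training and testing.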
Appears in Collection
EE-Conference Papers(학술회의논문)