Counterfactual Mix-Up for Visual Question Answering

Counterfactuals have been shown to be a powerful way to alleviate unimodal bias in Visual Question Answering. However, existing counterfactual methods tend to generate samples that lack diversity, or they require auxiliary models to synthesize additional data. We therefore propose a simpler counterfactual sample synthesis method that yields more diverse samples, called Counterfactual Mix-Up (CoMiU), which generates counterfactual image features and questions through batch-wise swapping at the local object level and word level. This method efficiently generates more abundant and diverse counterfactual samples, which help improve the robustness of Visual Question Answering models. Moreover, building on these diverse counterfactual samples, we introduce two more robust and stable contrastive loss functions, namely Batch-Contrastive loss and Answer-Contrastive loss. We test our method on various challenging Visual Question Answering robustness testing setups to show its advantages over current state-of-the-art methods.
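The batch-wise swapping idea can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `batch_swap`, the `swap_ratio` parameter, and the random choice of donor sample and positions are all assumptions made for illustration, since the abstract does not specify how swap targets are selected. Each sample is a list of items, which could stand for object-level image features or word-level question tokens.

```python
import random


def batch_swap(batch, swap_ratio=0.5, seed=None):
    """Hypothetical batch-wise swap: for each sample, replace a random
    subset of its items with the items at the same positions in another
    sample drawn from the same batch. The input batch is not modified."""
    rng = random.Random(seed)
    size = len(batch)
    if size < 2:
        raise ValueError("need at least two samples to swap between")

    mixed = []
    for i, sample in enumerate(batch):
        # Pick a donor index that is guaranteed to differ from i.
        j = rng.randrange(size - 1)
        if j >= i:
            j += 1
        donor = batch[j]

        # Only positions present in both samples can be swapped.
        n = min(len(sample), len(donor))
        k = min(n, max(1, int(n * swap_ratio)))
        positions = set(rng.sample(range(n), k))

        mixed.append([donor[p] if p in positions else item
                      for p, item in enumerate(sample)])
    return mixed


# Toy batch: each inner list stands for one sample's objects or words.
batch = [["a1", "a2", "a3", "a4"],
         ["b1", "b2", "b3", "b4"],
         ["c1", "c2", "c3", "c4"]]
for original, counterfactual in zip(batch, batch_swap(batch, seed=0)):
    print(original, "->", counterfactual)
```

Because the donors come from the current batch, a sketch like this needs no auxiliary generative model, which is the property the abstract emphasizes. How the swapped samples feed into the Batch-Contrastive and Answer-Contrastive losses is not described in the abstract and is left out here.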
Publisher
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Issue Date
2023
Language
English
Article Type
Article
Citation

IEEE ACCESS, v.11, pp.95201 - 95212

ISSN
2169-3536
DOI
10.1109/ACCESS.2023.3303891
URI
http://hdl.handle.net/10203/312984
Appears in Collection
EE-Journal Papers(저널논문)