
Explaining Disagreement in VQA with Eye Tracking


Description: When presented with the same question about an image, human annotators often give valid but disagreeing answers, indicating that their reasoning differed. In previous work [1], we found that visual attention maps provide insight into the different reasoning underlying disagreement in visual question answering (VQA). There are two possible extensions of this work: conducting an annotation study to systematically find relevant cases of disagreement, and extending the work to other datasets. The former requires building an annotation interface, filtering candidate cases (for example, removing disagreement caused by synonyms or spelling mistakes), conducting the annotation study, and validating the collected dataset; a possible filtering step is sketched below. For the latter, two datasets could be explored: Byproducts [2] and SalChartQA [3]. For both, the work would require a manual analysis step to check whether there are examples of disagreement caused by differences in visual attention, followed by an exploration of how to detect them.
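To make the filtering and detection steps concrete, the sketch below shows one way they could be prototyped. It is a minimal illustration under assumed data structures (each annotation as a dict with an "answer" string and a 2D "attention" array), not the pipeline from [1]; all function names, field names, and thresholds are illustrative. It discards disagreement that is likely a spelling variant via string similarity (a real filter would also handle synonyms, e.g. with WordNet or embedding similarity) and keeps answer pairs whose attention maps correlate weakly.

```python
# Illustrative sketch: filter trivial disagreement and flag pairs whose
# visual attention differs. Data format and thresholds are assumptions.
from difflib import SequenceMatcher

import numpy as np


def is_trivial_disagreement(a: str, b: str, threshold: float = 0.8) -> bool:
    """Treat near-identical strings (likely spelling variants) as agreement.

    String similarity only catches spelling mistakes; a fuller filter
    would also map synonyms onto each other.
    """
    a, b = a.strip().lower(), b.strip().lower()
    if a == b:
        return True
    return SequenceMatcher(None, a, b).ratio() >= threshold


def attention_similarity(map_a: np.ndarray, map_b: np.ndarray) -> float:
    """Pearson correlation between two attention maps of the same shape."""
    a = map_a.astype(float).ravel()
    b = map_b.astype(float).ravel()
    a = (a - a.mean()) / (a.std() + 1e-8)
    b = (b - b.mean()) / (b.std() + 1e-8)
    return float(np.mean(a * b))


def candidate_cases(annotations, sim_threshold: float = 0.3):
    """Yield index pairs that disagree in answer AND attend to
    different image regions.

    `annotations` is assumed to be a list of dicts with keys
    "answer" (str) and "attention" (2D np.ndarray); adapt to the
    actual dataset format.
    """
    for i in range(len(annotations)):
        for j in range(i + 1, len(annotations)):
            ai, aj = annotations[i], annotations[j]
            if is_trivial_disagreement(ai["answer"], aj["answer"]):
                continue  # spelling variant, not a real disagreement
            if attention_similarity(ai["attention"], aj["attention"]) < sim_threshold:
                yield i, j
```

Pearson correlation is just one common saliency-similarity metric; KL divergence or histogram intersection would fit the same slot, and the right choice would be part of the exploration.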

Supervisors: Susanne Hindennach and Yao Wang

Distribution: 20% literature, 60% implementation, 20% analysis and discussion

Requirements: Depend on the chosen extension

Literature: [1] Susanne Hindennach, Lei Shi, and Andreas Bulling. 2024. Explaining Disagreement in Visual Question Answering Using Eye Tracking. In 2024 Symposium on Eye Tracking Research and Applications (ETRA ’24), June 4–7, 2024, Glasgow, United Kingdom. ACM, New York, NY, USA, 7 pages. Paper link.

[2] D. Han, J. Choe, S. Chun, J. J. Y. Chung, M. Chang, S. Yun, J. Y. Song, and S. J. Oh. 2023. Neglected Free Lunch – Learning Image Classifiers Using Annotation Byproducts. In International Conference on Computer Vision (ICCV). Paper link.

[3] Yao Wang, Weitian Wang, Abdullah Abdelhafez, Mayar Elfares, Zhiming Hu, Mihai Bâce, and Andreas Bulling. 2024. SalChartQA: Question-driven Saliency on Information Visualisations. In Proceedings of the ACM CHI Conference on Human Factors in Computing Systems (CHI), pp. 1–14. Paper link.