

Visuals Lie, Consistency Speaks: Disentangling Spatial Attention from Reliability in Vision-Language Models

Logan Mann ⋅ Yi Xia ⋅ Ajit Saravanan ⋅ Ishan Dave ⋅ Saadullah Ismail ⋅ Shikhar Shiromani ⋅ Emily Huang ⋅ Ruizhe Li ⋅ Kevin Zhu

Abstract
