Oral in Workshop: ICLR 2025 Workshop on Bidirectional Human-AI Alignment

Representational Difference Clustering


Abstract:

Current computer vision models are approaching superhuman performance on visual categorization tasks in domains such as ecology and radiology. Explainable AI (XAI) methods aim to explain how such models make decisions. Unfortunately, to produce human-friendly explanations, XAI methods often simplify model behavior to the point that critical information is lost. For humans to learn how models achieve superhuman performance, we must work towards understanding these nuances. In this work, we consider the challenging task of visually explaining the differences between two representations. By nature, this task forces XAI methods to discard the coarse-grained, obvious aspects of a model's representation and focus on the nuances that make a model unique. To this end, we propose a clustering method that isolates neighborhoods of images that are close together in one representation but distant in the other. These discovered clusters represent concepts present in only one of the two representations. We use our method to compare different model representations and discover semantically meaningful clusters.
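To make the core idea concrete, below is a minimal, illustrative sketch of one way to surface neighborhoods of images that are close in one representation but distant in another. This is not the paper's algorithm: the embeddings `z_a` and `z_b`, the rank-normalization of distances, the product-form affinity, and the use of spectral clustering are all assumptions made for this example.

```python
# Illustrative sketch (not the paper's method): cluster items on a
# "close in representation A, far in representation B" affinity.
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.stats import rankdata
from sklearn.cluster import SpectralClustering

def difference_clusters(z_a, z_b, n_clusters=5, seed=0):
    """z_a, z_b: (n_images, d_a) and (n_images, d_b) embeddings of the same
    images under two representations. Returns a cluster label per image."""
    # Pairwise distances within each representation space.
    d_a = squareform(pdist(z_a, metric="cosine"))
    d_b = squareform(pdist(z_b, metric="cosine"))

    # Rank-normalize each distance matrix to (0, 1] so the two spaces are
    # comparable despite different scales.
    def rank_norm(d):
        condensed = squareform(d)
        return squareform(rankdata(condensed) / condensed.size)

    d_a, d_b = rank_norm(d_a), rank_norm(d_b)

    # High affinity when a pair is close in A (small d_a) but far in B (large d_b).
    affinity = (1.0 - d_a) * d_b
    np.fill_diagonal(affinity, 0.0)

    labels = SpectralClustering(
        n_clusters=n_clusters,
        affinity="precomputed",
        random_state=seed,
    ).fit_predict(affinity)
    return labels

# Toy usage: random embeddings stand in for two model representations.
rng = np.random.default_rng(0)
z_a = rng.normal(size=(200, 64))
z_b = rng.normal(size=(200, 32))
print(difference_clusters(z_a, z_b, n_clusters=4)[:10])
```

The product-form affinity is one simple design choice: it rewards pairs that are simultaneously near neighbors in the first space and far apart in the second, so clusters found on it correspond to structure present in only one of the two representations.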
