Poster

Image Clustering Conditioned on Text Criteria

Sehyun Kwon · Jaeseung Park · Minkyu Kim · Jaewoong Cho · Ernest K Ryu · Kangwook Lee

Halle B #45
Thu 9 May 7:30 a.m. PDT — 9:30 a.m. PDT

Abstract: Classical clustering methods do not provide users with direct control of the clustering results, and the clustering results may not be consistent with the relevant criterion that a user has in mind. In this work, we present a new methodology for performing image clustering based on user-specified criteria in the form of text by leveraging modern Vision-Language Models and Large Language Models. We call our method Image Clustering Conditioned on Text Criteria (IC$|$TC), and it represents a different paradigm of image clustering. IC$|$TC requires a minimal and practical degree of human intervention and grants the user significant control over the clustering results in return. Our experiments show that IC$|$TC can effectively cluster images with various criteria, such as human action, physical location, or the person's mood, significantly outperforming baselines.
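
To make the idea of clustering under a user-specified text criterion concrete, here is a minimal sketch. It is not the authors' implementation: the abstract does not spell out the pipeline, so the two-stage arrangement below (a vision-language model describes each image with the criterion in mind, then a language model maps the description to a cluster label) is an assumption. The names `cluster_by_text_criterion`, `describe`, `assign`, `fake_describe`, and `fake_assign` are hypothetical, and the toy stand-ins only do keyword matching so that the example runs without any model.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List


def cluster_by_text_criterion(
    images: Iterable[str],
    criterion: str,
    describe: Callable[[str, str], str],
    assign: Callable[[str, str], str],
) -> Dict[str, List[str]]:
    """Group image identifiers by a label derived from a text criterion.

    describe: hypothetical stand-in for a vision-language model call
              (image, criterion) -> text description.
    assign:   hypothetical stand-in for a language model call
              (description, criterion) -> cluster label.
    """
    clusters: Dict[str, List[str]] = defaultdict(list)
    for image in images:
        description = describe(image, criterion)
        label = assign(description, criterion)
        clusters[label].append(image)
    return dict(clusters)


# Toy stand-ins (assumptions for illustration only); real use would
# replace these with calls to actual models.
_FAKE_CAPTIONS = {
    "img_a": "a man playing guitar on a beach",
    "img_b": "a woman reading in a library",
    "img_c": "a child playing guitar in a library",
}

_FAKE_VOCAB = {
    "action": ["playing guitar", "reading"],
    "location": ["beach", "library"],
}


def fake_describe(image: str, criterion: str) -> str:
    # A real describer would condition on the criterion; this one ignores it.
    return _FAKE_CAPTIONS[image]


def fake_assign(description: str, criterion: str) -> str:
    # Return the first candidate label for this criterion found in the text.
    for candidate in _FAKE_VOCAB[criterion]:
        if candidate in description:
            return candidate
    return "other"


images = ["img_a", "img_b", "img_c"]
by_action = cluster_by_text_criterion(images, "action", fake_describe, fake_assign)
by_location = cluster_by_text_criterion(images, "location", fake_describe, fake_assign)
```

The same three images fall into different groups depending on whether the criterion is `"action"` or `"location"`, which is the kind of user control the abstract describes.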
