
Poster
in
Workshop: 2nd Workshop on Navigating and Addressing Data Problems for Foundation Models (DATA-FM)

Enhancing Interpretability in Generative AI Through Search-Based Data Influence Analysis

Theodoros Aivalis · Iraklis A. Klampanos · Antonis Troumpoukis · Joemon Jose


Abstract:

Generative AI models offer powerful capabilities but often lack transparency, making it difficult to interpret their outputs. This is critical in cases involving artistic or copyrighted content. This work introduces a search-inspired approach to improve the interpretability of these models by analysing the influence of training data on their outputs. Our method provides observational interpretability by focusing on a model’s output rather than on its internal state. We consider both raw data and latent-space embeddings when searching for the influence of data items in generated content. We evaluate our method by retraining models locally and by demonstrating the method’s ability to uncover influential subsets in the training data. This work lays the groundwork for future extensions, including user-based evaluations with domain experts, which are expected to improve observational interpretability further.
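The abstract describes searching for influential training items via latent-space embeddings. A minimal sketch of one plausible instantiation, assuming precomputed embeddings and cosine similarity as the search criterion (the paper's actual retrieval and scoring details are not given here; `influence_ranking` and its parameters are illustrative names, not the authors' API):

```python
import numpy as np

def influence_ranking(generated_emb, training_embs, top_k=5):
    """Rank training items by cosine similarity to a generated output.

    generated_emb: (d,) latent embedding of the generated sample.
    training_embs: (n, d) latent embeddings of the training items.
    Returns indices of the top_k most similar training items,
    interpreted here as the most plausibly influential ones.
    """
    g = generated_emb / np.linalg.norm(generated_emb)
    t = training_embs / np.linalg.norm(training_embs, axis=1, keepdims=True)
    scores = t @ g                      # cosine similarities, shape (n,)
    return np.argsort(scores)[::-1][:top_k]
```

The same ranking could be run over raw-data features instead of embeddings, matching the abstract's mention of considering both representations.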
