

Oral
in
Workshop on Spurious Correlation and Shortcut Learning: Foundations and Solutions

Measuring the Layer-Wise Impact of Image Shortcuts on Deep Model Features

Nikita Tsoy · Nikola Konstantinov

Keywords: [ representation learning ] [ out-of-distribution ] [ shortcuts ] [ deep learning ] [ distribution shift ]


Abstract:

Shortcuts, spurious patterns that perform well only on the training distribution, pose a major challenge to deep network reliability (Geirhos et al., 2020). In this work, we investigate the layer-wise impact of image shortcuts on learned features. First, we propose an experiment design that introduces artificial shortcut-inducing skews during training, enabling a counterfactual analysis of how different layers contribute to shortcut-related accuracy degradation. Next, we use our method to study the effects of a patch-like skew on CNNs trained on CIFAR-10 and CIFAR-100. Our analysis reveals that different types of skews affect network layers differently: class-universal skews (affecting all instances of a target class) and class-specific skews (affecting only one class) impact deeper layers more than non-universal and non-specific skews, respectively. Additionally, we identify the forgetting of shortcut-free features as a key mechanism behind the accuracy drop for our class of skews, indicating the potential role of simplicity bias (Shah et al., 2020) and excessive regularization (Sagawa et al., 2020) in shortcut learning.
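The patch-like skew the abstract describes can be sketched as follows. This is a hypothetical illustration only: the paper's exact patch size, placement, and intensity are not specified here, and the function name and parameters are assumptions. The `p` parameter controls whether the skew is class-universal (`p=1.0`, every instance of the target class is patched) or non-universal (`p<1.0`).

```python
import numpy as np

def add_patch_shortcut(images, labels, target_class,
                       patch_size=4, value=1.0, p=1.0, seed=0):
    """Overlay a fixed bright patch on images of target_class.

    Hypothetical sketch of a shortcut-inducing skew:
    - class-specific: only target_class receives the patch (as here);
      patching other classes as well would make the skew non-specific.
    - class-universal: p = 1.0 patches every instance of target_class;
      p < 1.0 gives a non-universal skew.

    images: float array of shape (N, H, W, C) with values in [0, 1].
    Returns a patched copy; the input array is left unchanged.
    """
    rng = np.random.default_rng(seed)
    out = images.copy()
    idx = np.flatnonzero(np.asarray(labels) == target_class)
    chosen = idx[rng.random(idx.size) < p]          # subset to skew
    out[chosen, :patch_size, :patch_size, :] = value  # top-left patch
    return out
```

Training a CNN on images skewed this way, and comparing against a counterfactual model trained without the patch, is one way to attribute layer-wise accuracy degradation to the shortcut.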
