

Oral
in
Workshop on Spurious Correlation and Shortcut Learning: Foundations and Solutions

Generalizing to any diverse distribution: uniformity & rebalancing

Andreas Loukas · Karolis Martinkus · Edward Wagstaff · Kyunghyun Cho

Keywords: [ finetuning ] [ ood generalization ] [ rebalancing ] [ out-of-distribution generalization ] [ diversity ]


Abstract:

As training datasets grow larger, we aspire to develop models that generalize well to any diverse test distribution, even one that deviates significantly from the training data. Approaches such as domain adaptation, domain generalization, and robust optimization attempt to address the out-of-distribution challenge by making assumptions about the relation between the training and test distributions. In contrast, we adopt a more conservative perspective by accounting for the worst-case error across all sufficiently diverse test distributions within a known domain. Our first finding is that training on a uniform distribution over this domain is optimal. We also examine practical remedies when uniform samples are unavailable, considering methods that mitigate non-uniformity through finetuning and rebalancing. Our theory aligns with previous observations on the role of entropy and rebalancing in out-of-distribution (o.o.d.) generalization, and we provide new empirical evidence across tasks involving o.o.d. shifts that demonstrates the applicability of our perspective.
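As a rough illustration of the rebalancing idea mentioned in the abstract, the sketch below reweights a skewed sample with inverse-frequency weights so that the weighted empirical distribution over a discrete grouping of the domain becomes uniform. This is an assumed, minimal example, not the authors' implementation; the grouping variable, weights, and per-example losses are hypothetical stand-ins.

```python
# Minimal sketch (assumption, not the paper's code): approximate training on a
# uniform distribution over a discrete domain by rebalancing a non-uniform sample.
import numpy as np

def rebalancing_weights(group_ids: np.ndarray) -> np.ndarray:
    """Inverse-frequency weights so each group contributes equally in expectation,
    i.e. the weighted empirical distribution over groups is uniform."""
    groups, counts = np.unique(group_ids, return_counts=True)
    freq = dict(zip(groups, counts / len(group_ids)))
    w = np.array([1.0 / freq[g] for g in group_ids])
    return w / w.mean()  # normalize so the average weight is 1

# Toy example: a skewed sample over 3 groups of a known domain.
rng = np.random.default_rng(0)
group_ids = rng.choice([0, 1, 2], size=1000, p=[0.7, 0.2, 0.1])
weights = rebalancing_weights(group_ids)

# A weighted per-example loss then estimates the risk under the uniform
# distribution rather than under the skewed sampling distribution.
per_example_loss = rng.random(1000)  # stand-in for model losses
uniform_risk_estimate = np.mean(weights * per_example_loss)
print(uniform_risk_estimate)
```

In the same spirit, the weights could be applied during finetuning of a pretrained model, so that gradient updates reflect the uniform distribution over the domain rather than the skewed training sample.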
