Noise-Aware Generalization: Robustness to In-Domain Noise and Out-of-Domain Generalization
Abstract
Methods addressing Learning with Noisy Labels (LNL) and multi-source Domain Generalization (DG) use training techniques to improve downstream task performance in the presence of label noise or domain shifts, respectively. Prior work often explores these tasks in isolation, with only limited work evaluating how label noise affects existing DG methods without also exploring methods to reduce its effect. However, many applications require methods that are robust to both label noise and distribution shifts, a setting we refer to as Noise-Aware Generalization (NAG), and new challenges emerge when these problems are considered together. For example, most LNL methods identify noise by detecting distribution shifts within a class's samples, i.e., they assume that distribution shifts often correspond to label noise. In NAG, however, distribution shifts can be due to either label noise or domain shifts, breaking the assumptions underlying LNL methods. DG methods, in turn, often overlook the effect of label noise entirely, which can confuse a model during training and reduce performance. A naive solution is to adopt an assumption common to many DG methods, namely that domain labels are available during training, which enables us to isolate the two types of shifts; however, this ignores valuable cross-domain information. In contrast, our proposed DL4ND approach improves noise detection by exploiting the observation that noisy samples that appear indistinguishable within a single domain often show greater variation when compared across domains. Experiments on seven diverse datasets show that DL4ND significantly improves performance, offering a promising direction for tackling NAG.
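The cross-domain comparison at the heart of this observation can be illustrated with a minimal sketch. The snippet below is a hypothetical illustration, not the paper's actual DL4ND algorithm: it flags a sample as a noise candidate when its embedding lies far from same-class samples drawn from other domains. The function name detect_noisy_candidates and the cosine-distance threshold tau are assumptions introduced here for illustration only.

```python
# Minimal sketch (hypothetical, NOT the paper's DL4ND method): flag a sample as
# a noise candidate when its embedding is far from same-class samples in OTHER
# domains, following the observation that noisy samples often stand out more
# under cross-domain comparison than within their own domain.
import numpy as np

def detect_noisy_candidates(features, labels, domains, tau=0.5):
    """features: (N, D) L2-normalized embeddings; labels/domains: (N,) int arrays.
    Returns a boolean mask of suspected noisy-label samples. tau is an assumed
    cosine-distance threshold, not a value from the paper."""
    noisy = np.zeros(len(labels), dtype=bool)
    for i in range(len(labels)):
        # Same class as sample i, but drawn from a *different* domain.
        mask = (labels == labels[i]) & (domains != domains[i])
        if not mask.any():
            continue  # no cross-domain evidence available for this sample
        # Cosine similarity to cross-domain peers (features assumed normalized).
        sims = features[mask] @ features[i]
        # Low average cross-domain similarity -> the label is suspect.
        noisy[i] = (1.0 - sims.mean()) > tau
    return noisy

# Toy usage with random data (illustration only).
rng = np.random.default_rng(0)
feats = rng.normal(size=(100, 16))
feats /= np.linalg.norm(feats, axis=1, keepdims=True)
mask = detect_noisy_candidates(feats, rng.integers(0, 5, 100), rng.integers(0, 3, 100))
print(f"{mask.sum()} suspected noisy samples out of {len(mask)}")
```

One design note on this sketch: restricting the comparison to other domains is what distinguishes it from a standard within-class LNL outlier test, since within a single domain a mislabeled sample may blend in with domain-specific variation.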