Exploring Mode Connectivity in Krylov Subspace for Domain Generalization
Abstract
This paper explores the geometric characteristics of loss landscapes to enhance domain generalization (DG) in deep neural networks. Existing methods mainly leverage the local flatness around minima to improve generalization. However, recent theoretical studies indicate that flatness does not universally guarantee better generalization. Instead, this paper investigates a global geometric property, mode connectivity: the phenomenon whereby distinct local minima are connected by continuous low-loss pathways. Unlike flatness, mode connectivity enables transitions from poorly generalizing to better-generalizing models without leaving low-loss regions. To navigate these connected pathways effectively, this paper proposes a novel Billiard Optimization Algorithm (BOA), which discovers superior models by mimicking billiard dynamics. During this process, BOA operates within a low-dimensional Krylov subspace, mitigating the curse of dimensionality arising from the high-dimensional parameter space of deep models. Furthermore, this paper reveals that oracle test gradients align strongly with the Krylov subspace constructed from training gradients across diverse datasets and architectures. This alignment offers a powerful tool for bridging training and test domains, enabling the efficient discovery of superior models from limited training domains. Experiments on DomainBed demonstrate that BOA consistently outperforms existing sharpness-aware and DG methods across diverse datasets and architectures. Notably, BOA surpasses sharpness-aware minimization by 3.6\% on VLCS with a ViT-B/16 backbone.
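To make the Krylov-subspace construction and the gradient-alignment claim concrete, the following is a minimal NumPy sketch, illustrative only and not the paper's implementation. It assumes access to a Hessian-vector-product callable `hvp`, builds an orthonormal basis for the Krylov subspace $\mathcal{K}_k(H, g) = \mathrm{span}\{g, Hg, \ldots, H^{k-1}g\}$ from a training gradient $g$ via Arnoldi-style orthogonalization, and then measures what fraction of another gradient's norm falls inside that subspace. The toy quadratic Hessian and the names `krylov_basis`, `subspace_alignment`, and `hvp` are hypothetical, introduced here purely for illustration.

```python
import numpy as np

def krylov_basis(hvp, g, k):
    """Arnoldi-style orthonormal basis for the order-k Krylov subspace
    span{g, Hg, ..., H^{k-1} g}, where the Hessian H is accessed only
    through the callable hvp(v) -> Hv."""
    d = g.shape[0]
    Q = np.zeros((d, k))
    Q[:, 0] = g / np.linalg.norm(g)
    for j in range(1, k):
        w = hvp(Q[:, j - 1])                # next Krylov direction H q_{j-1}
        w -= Q[:, :j] @ (Q[:, :j].T @ w)    # orthogonalize against existing basis
        nrm = np.linalg.norm(w)
        if nrm < 1e-10:                     # subspace became invariant; stop early
            return Q[:, :j]
        Q[:, j] = w / nrm
    return Q

def subspace_alignment(Q, v):
    """Fraction of v's norm captured by the subspace spanned by Q's columns,
    i.e., the cosine between v and its orthogonal projection onto the subspace."""
    proj = Q @ (Q.T @ v)
    return np.linalg.norm(proj) / np.linalg.norm(v)

# Toy usage: a quadratic loss with a random SPD "Hessian" stands in for the network.
rng = np.random.default_rng(0)
d, k = 200, 10
A = rng.standard_normal((d, d))
H = A @ A.T / d                              # symmetric positive-definite Hessian
g_train = rng.standard_normal(d)             # stand-in for a training gradient
Q = krylov_basis(lambda v: H @ v, g_train, k)
g_test = H @ g_train + 0.1 * rng.standard_normal(d)  # correlated "test" gradient
print(f"alignment = {subspace_alignment(Q, g_test):.3f}")
```

In this toy setup the "test" gradient is constructed to be correlated with the training gradient, so the printed alignment is close to 1 by design; the abstract's empirical claim is that an analogous alignment holds for genuine oracle test gradients across real datasets and architectures.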