Oral in Workshop on Spurious Correlation and Shortcut Learning: Foundations and Solutions

Context Heterogeneity Makes In-Context Knowledge Conflicts Harder to Detect

Chen Wu · Neil Kale · Aditi Raghunathan

Keywords: [ multimodal language models ] [ multilingual language models ] [ knowledge conflict ]


Abstract:

Language models (LMs) deployed in real-world tasks -- such as medical report synthesis, web navigation, and summarization -- must process diverse inputs and handle conflicting information. Users expect them to detect in-context knowledge conflicts -- direct contradictions about objective facts -- and issue alerts. Yet, we find a critical failure: when faced with conflicting evidence across heterogeneous contexts, such as multiple languages or modalities, LMs fail to detect conflicts, leaving them vulnerable to attacks and misinformation. While they achieve near-perfect detection accuracy in homogeneous contexts, accuracy drops by up to 65% in heterogeneous settings. We identify context imbalance as the root cause: LMs exhibit extreme attention asymmetry across domains, disproportionately prioritizing certain domains in mixed inputs. Current instruction tuning, which trains on separate examples from multiple domains, fails to correct this. Addressing it instead requires instance-level diverse data points that demand reasoning over multiple domains within a single context. We introduce Heterogeneous Instruction-Tuning (HeteroIT), a scalable dataset-mixing procedure that generates instance-level diversity by combining datasets from different domains. Applying HeteroIT to Bactrian-X, a standard multilingual instruction-tuning dataset, improves conflict detection by 37%.
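The abstract does not specify HeteroIT's implementation details. A minimal sketch of the general idea it describes -- building each training instance by combining passages from several domains so the model must reason across domains within one context -- might look like the following (all function and field names here are hypothetical, not from the paper):

```python
import random

def heteroit_mix(domain_datasets, num_instances, seed=0):
    """Hypothetical sketch of instance-level dataset mixing.

    Each output instance concatenates one passage sampled from every
    domain (e.g., different languages of Bactrian-X), so conflicting
    or complementary evidence from multiple domains appears within a
    single context rather than in separate training examples.
    """
    rng = random.Random(seed)
    mixed = []
    for _ in range(num_instances):
        # Sample one passage per domain, then shuffle so no domain
        # systematically occupies a privileged position in the context.
        parts = [rng.choice(dataset) for dataset in domain_datasets]
        rng.shuffle(parts)
        mixed.append({"context": "\n\n".join(p["text"] for p in parts)})
    return mixed

# Toy domains standing in for, e.g., two languages with conflicting facts.
en = [{"text": "The tower is 330 m tall."}]
fr = [{"text": "La tour mesure 300 m."}]
instances = heteroit_mix([en, fr], num_instances=2)
```

Each resulting instance interleaves evidence from multiple domains, which is the instance-level diversity the abstract argues that per-domain instruction tuning lacks.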
