Oral in Workshop: ICLR 2025 Workshop on Bidirectional Human-AI Alignment
The Lock-in Hypothesis: Stagnation by Algorithm
Abstract:
The training and deployment of large language models (LLMs) induce a feedback loop: models continually learn human beliefs from data, reinforce user beliefs with generated content, re-absorb those reinforced beliefs, and feed them back to users again, creating dynamics resembling an echo chamber. We articulate the hypothesis that this feedback loop with LLMs entrenches users' existing values and factual beliefs, leading to diversity loss and potentially the lock-in of ideas. Prompted by observations of diversity loss in real-world ChatGPT usage data, we study the lock-in hypothesis through data mining, agent-based LLM simulation, and formal modeling.
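To make the hypothesized dynamic concrete, the following is a minimal toy sketch (not the paper's formal model or its agent-based LLM simulation): a population of agents holds scalar "beliefs", a model is fit to those beliefs each round, agents shift toward the generated output, and belief diversity is tracked over time. All names and parameter values (N_AGENTS, ADOPTION_RATE, MODEL_NOISE, etc.) are illustrative assumptions.

```python
# Toy illustration of a human-LLM feedback loop and diversity loss.
# Assumed parameters; not drawn from the paper.
import numpy as np

N_AGENTS = 1000       # size of the simulated user population (assumed)
N_ROUNDS = 50         # number of train -> generate -> re-absorb cycles (assumed)
ADOPTION_RATE = 0.2   # how strongly users shift toward model output (assumed)
MODEL_NOISE = 0.05    # residual variation in generated content (assumed)

rng = np.random.default_rng(0)
beliefs = rng.normal(loc=0.0, scale=1.0, size=N_AGENTS)  # initial belief diversity

for t in range(N_ROUNDS):
    # "Training": the model absorbs the current population of beliefs.
    model_belief = beliefs.mean()
    # "Generation": content reflects the learned belief, with small variation.
    generated = model_belief + rng.normal(scale=MODEL_NOISE, size=N_AGENTS)
    # "Reinforcement": users move part of the way toward the generated content.
    beliefs = (1 - ADOPTION_RATE) * beliefs + ADOPTION_RATE * generated
    if t % 10 == 0:
        print(f"round {t:2d}: belief std = {beliefs.std():.3f}")
```

Under these assumptions the standard deviation of beliefs shrinks round over round, a simple analogue of the diversity loss the abstract describes.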