Oral in Workshop: ICLR 2025 Workshop on Bidirectional Human-AI Alignment
The Lock-in Hypothesis: Stagnation by Algorithm
Abstract:
The training and deployment of large language models (LLMs) induce a feedback loop: models continually learn human beliefs from data, reinforce user beliefs with generated content, re-absorb those reinforced beliefs, and feed them back to users again, creating dynamics resembling an echo chamber. We articulate the hypothesis that this feedback loop with LLMs entrenches users' existing values and factual beliefs, leading to diversity loss and potentially the lock-in of ideas. Prompted by observations of diversity loss in real-world ChatGPT usage data, we study the lock-in hypothesis through data mining, agent-based LLM simulation, and formal modeling.
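To make the hypothesized dynamic concrete, the following is a minimal toy sketch (not the paper's formal model or its agent-based LLM simulation): a population of agents holds scalar "beliefs", a model is fit to those beliefs each round, agents shift toward the generated output, and belief diversity is tracked over time. All names and parameter values (N_AGENTS, ADOPTION_RATE, MODEL_NOISE, etc.) are illustrative assumptions.

```python
# Toy illustration of a human-LLM feedback loop and diversity loss.
# Assumed parameters; not drawn from the paper.
import numpy as np

N_AGENTS = 1000       # size of the simulated user population (assumed)
N_ROUNDS = 50         # number of train -> generate -> re-absorb cycles (assumed)
ADOPTION_RATE = 0.2   # how strongly users shift toward model output (assumed)
MODEL_NOISE = 0.05    # residual variation in generated content (assumed)

rng = np.random.default_rng(0)
beliefs = rng.normal(loc=0.0, scale=1.0, size=N_AGENTS)  # initial belief diversity

for t in range(N_ROUNDS):
    # "Training": the model absorbs the current population of beliefs.
    model_belief = beliefs.mean()
    # "Generation": content reflects the learned belief, with small variation.
    generated = model_belief + rng.normal(scale=MODEL_NOISE, size=N_AGENTS)
    # "Reinforcement": users move part of the way toward the generated content.
    beliefs = (1 - ADOPTION_RATE) * beliefs + ADOPTION_RATE * generated
    if t % 10 == 0:
        print(f"round {t:2d}: belief std = {beliefs.std():.3f}")
```

Under these assumptions the standard deviation of beliefs shrinks round over round, a simple analogue of the diversity loss the abstract describes.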