Predictability Shapes Adaptation: An Evolutionary Perspective on Modes of Learning in Transformers
Alexander Ku · Thomas L. Griffiths · Stephanie Chan
Abstract
The success of Transformers lies in their ability to improve inference through two complementary strategies: the permanent refinement of model parameters via _in-weight learning_ (IWL), and the ephemeral modulation of inferences via _in-context learning_ (ICL), which leverages contextual information maintained in the model's activations. Evolutionary biology tells us that the predictability of the environment across timescales determines the extent to which analogous strategies should be preferred. Genetic _evolution_ adapts to stable environmental features by gradually modifying the genotype over generations. Conversely, environmental volatility favors _plasticity_, which enables a single genotype to express different traits within a lifetime, provided there are reliable cues to guide the adaptation. We operationalize these dimensions (environmental stability and cue reliability) in controlled task settings (sinusoid regression and Omniglot classification) to systematically characterize their influence on learning in Transformers. We find that stable environments favor IWL, often exhibiting a sharp transition when conditions are static. In contrast, reliable cues favor ICL, particularly when the environment is volatile. Furthermore, an analysis of learning dynamics reveals task-dependent transitions between strategies (ICL $\to$ IWL and vice versa). We demonstrate that these transitions are governed by the tension between (1) the asymptotic optimality of the strategy with respect to the environment, and (2) the optimization cost of acquiring that strategy, which depends on the task structure and the learner's inductive bias.
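To make the two dimensions concrete, the following is a minimal Python sketch of how an episodic sinusoid-regression environment with tunable environmental stability and cue reliability might be constructed. This is an illustrative assumption, not the authors' implementation: the parameter names (`p_stay`, `p_reliable`), the task distribution, and the episode format are hypothetical.

```python
# Minimal sketch (not the paper's released code) of an episodic
# sinusoid-regression stream in which two knobs correspond to the
# abstract's dimensions:
#   p_stay     -- environmental stability: probability the underlying
#                 task persists from one episode to the next
#   p_reliable -- cue reliability: probability the in-context examples
#                 actually reflect the task that generates the query
import numpy as np

rng = np.random.default_rng(0)

def sample_task():
    """Sample a sinusoid f(x) = A * sin(x + phi) with random amplitude/phase."""
    return {"A": rng.uniform(0.5, 2.0), "phi": rng.uniform(0.0, np.pi)}

def make_episode(task, p_reliable, n_context=8):
    """Build one episode: context (x, y) pairs plus a held-out query.

    With probability 1 - p_reliable the context is generated by an
    unrelated task, so it carries no signal about the query and a purely
    in-context strategy cannot help.
    """
    cue_task = task if rng.random() < p_reliable else sample_task()
    xs = rng.uniform(-5.0, 5.0, size=n_context + 1)
    ctx_y = cue_task["A"] * np.sin(xs[:-1] + cue_task["phi"])
    query_y = task["A"] * np.sin(xs[-1] + task["phi"])  # target from true task
    return (xs[:-1], ctx_y), (xs[-1], query_y)

def stream(p_stay, p_reliable, n_episodes=1000):
    """Yield episodes; the task is resampled between episodes w.p. 1 - p_stay."""
    task = sample_task()
    for _ in range(n_episodes):
        if rng.random() > p_stay:
            task = sample_task()  # volatile environment: new task
        yield make_episode(task, p_reliable)

# A stable, reliable stream (p_stay near 1) lets the task be memorized in
# the weights (IWL); a volatile but reliable stream rewards reading the cue
# from context (ICL); unreliable cues undercut ICL regardless of volatility.
for (ctx_x, ctx_y), (x_q, y_q) in stream(p_stay=0.99, p_reliable=1.0, n_episodes=3):
    print(len(ctx_x), round(x_q, 3), round(y_q, 3))
```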