Spectrum Tuning: Post-Training for Distributional Coverage and In-Context Steerability
Taylor Sorensen · Benjamin Newman · Jared Moore · Chan Young Park · Jillian Fisher · Niloofar Mireshghallah · Liwei Jiang · Yejin Choi
Abstract
Language model post-training has enhanced instruction-following and performance on many downstream tasks, but it also comes with an often-overlooked cost on tasks with many possible valid answers. We characterize three desiderata: in-context steerability, valid output space coverage, and distributional alignment, and we document across three model families how post-training can reduce these properties. In particular, we disambiguate between two kinds of in-context learning: ICL for eliciting existing underlying knowledge or capabilities, and in-context steerability, where a model must use in-context information to override its priors and steer toward a novel data-generating distribution. To better evaluate and improve these desiderata, we introduce Spectrum Suite, a large-scale resource compiled from $>40$ data sources and spanning $>90$ tasks requiring models to steer to and match diverse distributions. We find that while instruction-tuning helps elicit underlying knowledge and capabilities, it hurts a model's ability to flexibly steer in-context. To mitigate these issues, we propose Spectrum Tuning, a post-training method that uses Spectrum Suite to improve steerability and distributional coverage. We find that Spectrum Tuning often improves over both pretrained models and their instruction-tuned counterparts, enhancing steerability, spanning more of the output space, and improving distributional alignment on held-out datasets.