Poster in Workshop: Deep Generative Model in Machine Learning: Theory, Principle and Efficacy

On the Power of Context Enhanced Learning in LLMs

Xingyu Zhu · Abhishek Panigrahi · Sanjeev Arora

Keywords: [ Multi-step reasoning ] [ In-Context Learning ] [ Privileged Information ] [ Sample Efficiency ] [ Knowledge Internalization ]


Abstract:

We formalize a new concept for LLMs, context-enhanced learning. It involves standard gradient-based learning on text, except that the context is enhanced with additional data on which no auto-regressive gradients are computed. This setting is a gradient-based analog of usual in-context learning (ICL) and appears in some recent works. Using a multi-step reasoning task, we prove in a simplified setting that context-enhanced learning can be exponentially more sample-efficient than standard learning when the model is capable of ICL. At a mechanistic level, we find that the benefit of context enhancement arises from a more accurate gradient learning signal. Our findings highlight the potential of context-enhanced learning to bridge gradient-based learning and in-context learning, offering new insights into their interplay.
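To make the training setup concrete, below is a minimal sketch (not the authors' implementation) of one way to realize the objective described in the abstract with a Hugging Face causal LM: the enhancing material is prepended to the input so it is visible during the forward pass, but its token positions are masked out of the loss, so no auto-regressive gradient is computed on it. The strings `helper_text` and `train_text` are hypothetical placeholders for the enhanced context and the trained-on text.

```python
# Minimal sketch, assuming a Hugging Face causal LM (here "gpt2" as a stand-in).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Hypothetical data: extra context (no loss on these tokens) and the training text.
helper_text = "Additional contextual material the model can attend to. "
train_text = "The text on which auto-regressive gradients are actually computed."

ctx_ids = tokenizer(helper_text, return_tensors="pt").input_ids
tgt_ids = tokenizer(train_text, return_tensors="pt").input_ids

# Concatenate context + target so the context is visible during the forward pass.
input_ids = torch.cat([ctx_ids, tgt_ids], dim=1)

# Mask the context positions with -100 so the cross-entropy loss (and hence the
# auto-regressive learning signal) is computed only on the target tokens.
labels = input_ids.clone()
labels[:, : ctx_ids.shape[1]] = -100

outputs = model(input_ids=input_ids, labels=labels)
outputs.loss.backward()  # gradients flow through the context's activations,
                         # but no next-token loss is placed on the context itself
```

In this sketch the context still shapes the gradient indirectly through attention, which is the sense in which the setting is a gradient-based analog of in-context learning.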
