Poster in Workshop: Deep Generative Model in Machine Learning: Theory, Principle and Efficacy

On the Power of Context Enhanced Learning in LLMs

Xingyu Zhu · Abhishek Panigrahi · Sanjeev Arora

Keywords: [ Multi-step reasoning ] [ In-Context Learning ] [ Privileged Information ] [ Sample Efficiency ] [ Knowledge Internalization ]


Abstract:

We formalize a new concept for LLMs, context-enhanced learning. It involves standard gradient-based learning on text, except that the context is enhanced with additional data on which no auto-regressive gradients are computed. This setting is a gradient-based analog of usual in-context learning (ICL) and appears in some recent works. Using a multi-step reasoning task, we prove in a simplified setting that context-enhanced learning can be exponentially more sample-efficient than standard learning when the model is capable of ICL. At a mechanistic level, we find that the benefit of context enhancement arises from a more accurate gradient learning signal. Our findings highlight the potential of context-enhanced learning to bridge gradient-based learning and in-context learning, offering new insights into their interplay.
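To make the training setup concrete, below is a minimal sketch (not the authors' implementation) of one way to realize the objective described in the abstract with a Hugging Face causal LM: the enhancing material is prepended to the input so it is visible during the forward pass, but its token positions are masked out of the loss, so no auto-regressive gradient is computed on it. The strings `helper_text` and `train_text` are hypothetical placeholders for the enhanced context and the trained-on text.

```python
# Minimal sketch, assuming a Hugging Face causal LM (here "gpt2" as a stand-in).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Hypothetical data: extra context (no loss on these tokens) and the training text.
helper_text = "Additional contextual material the model can attend to. "
train_text = "The text on which auto-regressive gradients are actually computed."

ctx_ids = tokenizer(helper_text, return_tensors="pt").input_ids
tgt_ids = tokenizer(train_text, return_tensors="pt").input_ids

# Concatenate context + target so the context is visible during the forward pass.
input_ids = torch.cat([ctx_ids, tgt_ids], dim=1)

# Mask the context positions with -100 so the cross-entropy loss (and hence the
# auto-regressive learning signal) is computed only on the target tokens.
labels = input_ids.clone()
labels[:, : ctx_ids.shape[1]] = -100

outputs = model(input_ids=input_ids, labels=labels)
outputs.loss.backward()  # gradients flow through the context's activations,
                         # but no next-token loss is placed on the context itself
```

In this sketch the context still shapes the gradient indirectly through attention, which is the sense in which the setting is a gradient-based analog of in-context learning.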
