Understanding Reasoning in Thinking Language Models via Steering Vectors

Constantin Venhoff · Iván Arcuschin · Philip Torr · Arthur Conmy · Neel Nanda

Abstract
