Poster

On Targeted Manipulation and Deception when Optimizing LLMs for User Feedback

Marcus Williams · Micah Carroll · Adhyyan Narang · Constantin Weisser · Brendan Murphy · Anca Dragan
2025 Poster

Abstract
