Poster

On Targeted Manipulation and Deception when Optimizing LLMs for User Feedback

Marcus Williams ⋅ Micah Carroll ⋅ Adhyyan Narang ⋅ Constantin Weisser ⋅ Brendan Murphy ⋅ Anca Dragan
2025 Poster

Abstract

