Poster

Auto-RT: Automatic Jailbreak Strategy Exploration for Red-Teaming Large Language Models

Yanjiang Liu · Shuheng Zhou · Yaojie Lu · Huijia Zhu · Weiqiang Wang · Hongyu Lin · Ben He · Xianpei Han · Le Sun

Abstract
