Poster

Fine-tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend To!

Xiangyu Qi ⋅ Yi Zeng ⋅ Tinghao Xie ⋅ Pin-Yu Chen ⋅ Ruoxi Jia ⋅ Prateek Mittal ⋅ Peter Henderson
2024 Poster
