Oral
Fri Apr 25 07:30 PM -- 07:42 PM (PDT) @ Garnet 216-218
Sparse Feature Circuits: Discovering and Editing Interpretable Causal Graphs in Language Models
Samuel Marks · Can Rager · Eric Michaud · Yonatan Belinkov · David Bau · Aaron Mueller
[Slides] [OpenReview]
Oral
Fri Apr 25 07:42 PM -- 07:54 PM (PDT) @ Garnet 216-218
Unlearning-based Neural Interpretations
Ching Lam Choi · Alexandre Duplessis · Serge Belongie
[OpenReview]
Oral
Fri Apr 25 07:54 PM -- 08:06 PM (PDT) @ Garnet 216-218
Knowledge Entropy Decay during Language Model Pretraining Hinders New Knowledge Acquisition
Jiyeon Kim · Hyunji Lee · Hyowon Cho · Joel Jang · Hyeonbin Hwang · Seungpil Won · Youbin Ahn · Dohaeng Lee · Minjoon Seo
[OpenReview]
Oral
Fri Apr 25 08:06 PM -- 08:18 PM (PDT) @ Garnet 216-218
Cross-Entropy Is All You Need To Invert the Data Generating Process
Patrik Reizinger · Alice Bizeul · Attila Juhos · Julia E Vogt · Randall Balestriero · Wieland Brendel · David Klindt
[OpenReview]
Oral
Fri Apr 25 08:18 PM -- 08:30 PM (PDT) @ Garnet 216-218
Can a MISL Fly? Analysis and Ingredients for Mutual Information Skill Learning
Chongyi Zheng · Jens Tuyls · Joanne Peng · Benjamin Eysenbach
[Slides] [OpenReview]