Red Teaming the Law: An Adversarial Approach to Legal Alignment
Rui-Jie Yew ⋅ Greg Demirchyan
Abstract
A core steering mechanism in a piece of regulation may lie not in what is written within it (the primal path it lays for following its rules) but in what is left out of it (the dual path it lays for avoiding them). In this early-stage work, we frame the process of legal alignment, meaning the alignment of a regulation's text with its goals, as a multi-agent zero-sum game. We hope to demonstrate the relevance of AI alignment methods to legal oversight and the value of an adversarial frame for policy design.