Learning Representations of Instruments for Partial Identification of Treatment Effects
Abstract
Reliable estimation of treatment effects from observational data is crucial in fields like medicine, yet challenging when the unconfoundedness assumption is violated. We leverage arbitrary (potentially high-dimensional) instruments to estimate bounds on the conditional average treatment effect (CATE). Our contributions are three-fold: (1) We propose a novel approach for partial identification by mapping instruments into a discrete representation space that yields valid CATE bounds, essential for reliable decision-making. (2) We derive a two-step procedure that learns tight bounds via neural partitioning of the latent instrument space, thereby avoiding instability from numerical approximations or adversarial training and reducing finite-sample variance. (3) We provide theoretical guarantees for valid bounds with reduced variance and demonstrate effectiveness through extensive experiments. Overall, our method offers a new avenue for practitioners to exploit high-dimensional instruments (e.g., in Mendelian randomization).
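To illustrate why a discrete instrument representation yields valid bounds, the following minimal sketch computes classical Manski-style bounds for a bounded outcome, a binary treatment, and an instrument that is assumed to be already discretized. It is not the two-step neural procedure summarized above: the function names `iv_effect_bounds` and `simulate`, the outcome range `[y_min, y_max]`, and the simulated data-generating process are all assumptions made for illustration. Covariates are omitted, so the bounds are on the average effect; conditioning on covariates would be needed for the CATE.

```python
import numpy as np

def iv_effect_bounds(y, a, z, y_min, y_max):
    """Manski-style bounds on E[Y(1) - Y(0)] with a discrete instrument z.

    Assumes y lies in [y_min, y_max], a is binary, and z is independent of
    the potential outcomes (illustrative assumptions, not the paper's method).
    """
    y, a, z = np.asarray(y), np.asarray(a), np.asarray(z)
    lower = {0: [], 1: []}
    upper = {0: [], 1: []}
    for level in np.unique(z):
        in_level = z == level
        for arm in (0, 1):
            took_arm = a[in_level] == arm
            # E[Y * 1{A = arm} | Z = level]: the observed part of E[Y(arm)].
            observed = np.mean(y[in_level] * took_arm)
            # P(A != arm | Z = level): the mass where Y(arm) is unobserved.
            p_unobserved = 1.0 - took_arm.mean()
            lower[arm].append(observed + y_min * p_unobserved)
            upper[arm].append(observed + y_max * p_unobserved)
    # Instrument independence lets us intersect the bounds across levels.
    lo = {arm: max(lower[arm]) for arm in (0, 1)}
    hi = {arm: min(upper[arm]) for arm in (0, 1)}
    return lo[1] - hi[0], hi[1] - lo[0]

def simulate(n, seed=0):
    """Hypothetical confounded data with true average effect 0.3, y in [0, 1]."""
    rng = np.random.default_rng(seed)
    u = rng.integers(0, 2, size=n)   # unobserved confounder
    z = rng.integers(0, 3, size=n)   # discrete instrument with 3 levels
    p_treat = np.clip(0.2 + 0.25 * z + 0.3 * u, 0.0, 1.0)
    a = (rng.random(n) < p_treat).astype(int)
    y = 0.2 + 0.3 * a + 0.4 * u + 0.1 * rng.random(n)
    return y, a, z

y, a, z = simulate(30_000)
effect_lower, effect_upper = iv_effect_bounds(y, a, z, y_min=0.0, y_max=1.0)
print(f"bounds on the average effect: [{effect_lower:.3f}, {effect_upper:.3f}]")
```

Taking the maximum of the lower bounds and the minimum of the upper bounds over instrument levels is what tightens the interval. With few observations per level these extrema become noisy, which is one way to read the abstract's concern about finite-sample variance when partitioning the instrument space.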