Poster in Workshop: World Models: Understanding, Modelling and Scaling
Unifying Causal and Object-centric Representation Learning allows Causal Composition
Avinash Kori · Ben Glocker · David Ha · Francesco Locatello
Keywords: [ causal representation learning ] [ scene graphs ] [ object-centric learning ] [ composition ]
The goal of Object-Centric Learning (OCL) is to enable machine learning systems to decompose complex scenes into discrete, interacting objects, supporting compositional generalization and human-like reasoning. However, existing OCL methods often fail to capture interactions from both attribute-level (semantic) and object-level (spatial) perspectives. Scene graph methods complement OCL by abstracting scenes as structured graphs, but they typically rely on supervision. This position paper argues for a probabilistic perspective on Scene Graph Modelling (SGM) grounded in causal abstraction, which offers a unifying view of causality, OCL, and scene graphs. By treating object interactions as invariant mechanisms within object-level graphs, this view enables the generation of causally consistent scene compositions. We substantiate our position with a thorough conceptual discussion, rigorous definitions, conjectures, and examples, demonstrating how this perspective bridges the gap between unsupervised object discovery and explicit scene graph reasoning.