Poster in Workshop: I Can't Believe It's Not Better: Challenges in Applied Deep Learning

Challenges of Decomposing Tools in Surgical Scenes Through Disentangling The Latent Representations

Sai Lokesh Gorantla ⋅ Raviteja Sista ⋅ Apoorva Srivastava ⋅ Utpal De ⋅ Partha Chakrabarti ⋅ Debdoot Sheet

Abstract

Image generation through disentangled object representations is an active area of research with significant potential. Disentanglement separates the representations of objects from those of their attributes, giving greater control over the generated output. Existing approaches, however, disentangle only the attributes of objects and generate images from selected combinations of those attributes. This study explores learning object-level disentanglement of semantically rich latent representations using von Mises-Fisher (vMF) distributions. The proposed approach aims to disentangle compressed representations into object and background classes, and it is tested on surgical scenes from the Cholec80 dataset, where the goal is to separate tools from background information. Achieving tool-background disentanglement would make it possible to generate rare and custom surgical scenes. In practice, however, the proposed method learns to disentangle representations based on pixel intensities rather than at the object level. This study uncovers the challenges and shortfalls in achieving object-level disentanglement of compressed representations using vMF distributions. The code for this study is available at https://github.com/it-is-lokesh/vMF-disentanglement-challenges.
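
As an illustration of how vMF distributions can split latent vectors into classes, the following is a minimal sketch and not the authors' implementation. It assumes unit-normalized latent vectors, one vMF kernel per class, and a shared concentration parameter kappa with equal class priors. Under those assumptions the vMF normalizers cancel and the class posterior reduces to a softmax over kappa times the cosine similarity. The function names, the value of kappa, and the 3-D example vectors are all hypothetical.

```python
import math


def normalize(v):
    """Scale a vector to unit L2 norm so that it lies on the hypersphere."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def vmf_responsibilities(z, means, kappa=10.0):
    """Soft-assign one latent vector to vMF kernels (hypothetical helper).

    Assumes every kernel shares the same kappa and prior, so the posterior
    is a softmax over kappa * cosine(z, mean_k).
    """
    z = normalize(z)
    logits = [
        kappa * sum(a * b for a, b in zip(z, normalize(m)))
        for m in means
    ]
    peak = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(logit - peak) for logit in logits]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical kernel directions: index 0 = "tool", index 1 = "background".
means = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]

tool_like = vmf_responsibilities([0.9, 0.1, 0.0], means)
ambiguous = vmf_responsibilities([1.0, 1.0, 0.0], means)
```

Because the assignment depends only on the direction of each latent vector, the classes that emerge are determined by whatever the latent directions encode, which need not be object identity.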
