Virtual presentation / poster accept

DualAfford: Learning Collaborative Visual Affordance for Dual-gripper Manipulation

Yan Zhao ⋅ Ruihai Wu ⋅ Zhehuan Chen ⋅ Yourong Zhang ⋅ Qingnan Fan ⋅ Kaichun Mo ⋅ Hao Dong

Keywords: Visual Understanding of 3D Shapes; Visual Actionable Representation for Robotics Applications


Abstract

It is essential yet challenging for future home-assistant robots to understand and manipulate diverse 3D objects in daily human environments. Toward building scalable systems that can perform diverse manipulation tasks over various 3D shapes, recent works have advocated, and demonstrated promising results in, learning visual actionable affordance, which labels every point on the input 3D geometry with the likelihood that acting at that point accomplishes the downstream task (e.g., pushing or picking up). However, these works studied only single-gripper manipulation, whereas many real-world tasks require two hands working collaboratively. In this work, we propose a novel learning framework, DualAfford, to learn collaborative affordance for dual-gripper manipulation tasks. The core design of the approach is to reduce the quadratic problem for two grippers into two disentangled yet interconnected subtasks for efficient learning. Using the large-scale PartNet-Mobility and ShapeNet datasets, we set up four benchmark tasks for dual-gripper manipulation. Experiments demonstrate the effectiveness and superiority of our method over three baselines. We will release code and data upon acceptance.
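To make the "quadratic problem" concrete, the following is a minimal sketch, not the DualAfford implementation. It assumes one plausible reading of the decomposition: score every point for the first gripper, pick a contact, then score every point for the second gripper conditioned on that contact. The two scoring functions are hypothetical hand-written heuristics standing in for learned affordance networks, which the abstract does not describe in detail. The point is only the cost: 2N score evaluations instead of N × N for an exhaustive search over point pairs.

```python
import math
import random


def first_gripper_scores(points):
    """Toy stand-in for a learned affordance map: one score in (0, 1) per point."""
    # Hypothetical heuristic: favor points with large |x|.
    return [1.0 / (1.0 + math.exp(-abs(p[0]))) for p in points]


def second_gripper_scores(points, first_idx):
    """Toy conditional map: score each point given the first gripper's contact."""
    anchor = points[first_idx]
    # Hypothetical heuristic: favor points far from the first contact.
    dists = [math.dist(p, anchor) for p in points]
    max_d = max(dists) or 1.0  # avoid dividing by zero for a single point
    return [d / max_d for d in dists]


def select_contact_pair(points):
    """Pick two contact points using 2N score evaluations rather than N * N."""
    s1 = first_gripper_scores(points)
    i = max(range(len(points)), key=s1.__getitem__)
    s2 = second_gripper_scores(points, i)
    j = max(range(len(points)), key=s2.__getitem__)
    return i, j, 2 * len(points)


random.seed(0)
# A random point cloud standing in for an object's 3D geometry.
cloud = [tuple(random.uniform(-1, 1) for _ in range(3)) for _ in range(1000)]
i, j, n_evals = select_contact_pair(cloud)
print("contacts:", i, j)
print("evaluations:", n_evals, "vs exhaustive pairs:", len(cloud) ** 2)
```

With 1000 points, the sequential scheme evaluates 2000 scores, while scoring every pair jointly would need 1,000,000. The second stage still depends on the first stage's choice, which is one way the two subtasks can be separate yet interconnected.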
