

ProVision: Programmatically Scaling Vision-centric Instruction Data for Multimodal Language Models

Jieyu Zhang · Le Xue · Linxin Song · Jun Wang · Weikai Huang · Manli Shu · An Yan · Zixian Ma · Juan Carlos Niebles · Silvio Savarese · Caiming Xiong · Zeyuan Chen · Ranjay Krishna · Ran Xu

Abstract
