

AlignVLM: Bridging Vision and Language Latent Spaces for Multimodal Understanding

Ahmed Masry · Juan A. Rodriguez · Tianyu Zhang · Suyuchen Wang · Chao Wang · Aarash Feizi · Akshay Suresh · Abhay Puri · Xiangru Jian · Pierre-André Noël · Sathwik Tejaswi Madhusudhan · Marco Pedersoli · Bang Liu · Nicolas Chapados · Yoshua Bengio · Enamul Hoque · Christopher Pal · Issam Laradji · David Vazquez · Perouz Taslakian · Spandana Gella · Sai Rajeswar


