Poster

BigDocs: An Open Dataset for Training Multimodal Models on Document and Code Tasks

Juan A. Rodriguez ⋅ Xiangru Jian ⋅ Siba Smarak Panigrahi ⋅ Tianyu Zhang ⋅ Aarash Feizi ⋅ Abhay Puri ⋅ Akshay Suresh ⋅ François Savard ⋅ Ahmed Masry ⋅ Shravan Nayak ⋅ Rabiul Awal ⋅ Mahsa Massoud ⋅ Amirhossein Abaskohi ⋅ Zichao Li ⋅ Suyuchen Wang ⋅ Pierre-André Noël ⋅ Mats L. Richter ⋅ Saverio Vadacchino ⋅ Shubham Agarwal ⋅ Sanket Biswas ⋅ Sara Shanian ⋅ Ying Zhang ⋅ Sathwik Tejaswi Madhusudhan ⋅ Joao Monteiro ⋅ Krishnamurthy Dvijotham ⋅ Torsten Scholak ⋅ Nicolas Chapados ⋅ Sepideh Kharaghani ⋅ Sean Hughes ⋅ M. Tamer Özsu ⋅ Siva Reddy ⋅ Marco Pedersoli ⋅ Yoshua Bengio ⋅ Christopher Pal ⋅ Issam Laradji ⋅ Spandana Gella ⋅ Perouz Taslakian ⋅ David Vazquez ⋅ Sai Rajeswar
2025 Poster

Abstract
