ML-Bench: Evaluating Large Language Models and Agents for Machine Learning Tasks on Repository-Level Code

Xiangru Tang · Yuliang Liu · Zefan Cai · Daniel Shao · Junjie Lu · Yichi Zhang · Zexuan Deng · Helan Hu · Kaikai An · Ruijun Huang · Shuzheng Si · Chen Sheng · Haozhe Zhao · Liang Chen · Tianyu Liu · Yujia Qin · Wangchunshu Zhou · Yilun Zhao · Zhiwei Jiang · Baobao Chang · Arman Cohan · Mark Gerstein

Abstract
