

ML-BENCH: EVALUATING LARGE LANGUAGE MODELS AND AGENTS FOR MACHINE LEARNING TASKS ON REPOSITORY-LEVEL CODE

Xiangru Tang ⋅ Yuliang Liu ⋅ Zefan Cai ⋅ Daniel Shao ⋅ Junjie Lu ⋅ Yichi Zhang ⋅ Zexuan Deng ⋅ Helan Hu ⋅ Kaikai An ⋅ Ruijun Huang ⋅ Shuzheng Si ⋅ Chen Sheng ⋅ Haozhe Zhao ⋅ Liang Chen ⋅ Tianyu Liu ⋅ Yujia Qin ⋅ Wangchunshu Zhou ⋅ Yilun Zhao ⋅ Zhiwei Jiang ⋅ Baobao Chang ⋅ Arman Cohan ⋅ Mark Gerstein

Abstract
