Think Outside the Bot: Automating Evaluation of Creativity in LLMs for Physical Reasoning with Semantic Entropy and Efficient Multi-Agent Judge

Min Sen Tan · Zachary Choy · Swaagat Saikia · Syed Ali Redha Alsagoff · Banerjee Mohor · Nadya Wangsajaya · Alvin Chan

Abstract
