Poster

Combatting Dimensional Collapse in LLM Pre-Training Data via Submodular File Selection

Ziqing Fan ⋅ Siyuan Du ⋅ Shengchao Hu ⋅ Pingjie Wang ⋅ Li Shen ⋅ Ya Zhang ⋅ Dacheng Tao ⋅ Yanfeng Wang
2025 Poster

Abstract
