Poster

Common Corpus: The Largest Collection of Ethical Data for LLM Pre-Training

Pierre-Carl Langlais · Pavel Chizhov · Catherine Arnett · Carlos Hinostroza · Mattia Nee · Eliot Jones · Irène Girard · David Mach · Anastasia Stasenko · Ivan Yamshchikov

Abstract
