Quentin Lhoest
5 papers · 2020–2024 · 2 conferences · across top CS/AI conferences
Achievements
Keyword Pioneer
Conference Polyglot (2)
Taxonomy Completionist (14)
Hot Topic Early Bird
Cross-Pollinator (15)
Renaissance Researcher (5)
Interdisciplinary Bridge
Mega-Team (54)
Keyword Champion (2)
Trend Setter
Conferences
NIPS (3)
EMNLP (2)
Top co-authors
Research topics
Keywords
model pretraining (2)
neural network training (1)
natural language processing (1)
language model training (1)
distributed learning (1)
responsible ai (1)
machine learning (1)
language model (1)
model fine-tuning (1)
pretrained model (1)
metadata format (1)
dataset interoperability (1)
data management (1)
machine learning dataset (1)
data curation (1)
text corpus (1)
multilingual dataset (1)
multilingual corpus (1)
collaborative training (1)
corpus curation (1)
Papers
Croissant: A Metadata Format for ML-Ready Datasets
NIPS 2024
The BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset
NIPS 2022
Distributed Deep Learning in Open Collaborations
NIPS 2021
Datasets: A Community Library for Natural Language Processing
EMNLP 2021
Transformers: State-of-the-Art Natural Language Processing
EMNLP 2020