Teven Le Scao
12 papers · 2020–2025 · 9 conferences · across top CS/AI conferences
Achievements
Renaissance Researcher (5)
Interdisciplinary Bridge
Conference Polyglot (9)
Academic Marathon (5)
Taxonomy Completionist (20)
Hot Topic Early Bird
Triple Crown
Mega-Team (54)
Century Club (12)
Trend Setter
The Questioner (3)
Conferences
EMNLP (3)
NIPS (2)
AACL (1)
ACL (1)
ICLR (1)
ICML (1)
IJCNLP (1)
JMLR (1)
NAACL (1)
Research topics
Keywords
zero-shot generalization (3)
large language model (3)
transformer architecture (3)
language model (2)
data repetition (2)
model scaling (2)
multilingual language model (2)
prompt engineering (2)
model pretraining (1)
scaling law (1)
language model training (1)
natural language processing (1)
model fine-tuning (1)
scaling behavior (1)
model training (1)
data curation (1)
multilingual model (1)
token efficiency (1)
pretrained model (1)
pretraining corpus (1)
Papers
Scaling Data-Constrained Language Models
JMLR 2025
Joint Representations of Text and Knowledge Graphs for Retrieval and Evaluation
IJCNLP 2023
Scaling Data-Constrained Language Models
NIPS 2023
Joint Representations of Text and Knowledge Graphs for Retrieval and Evaluation
AACL 2023
Crosslingual Generalization through Multitask Finetuning
ACL 2023
The BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset
NIPS 2022
What Language Model to Train if You Have One Million GPU Hours?
EMNLP 2022
Multitask Prompted Training Enables Zero-Shot Task Generalization
ICLR 2022
What Language Model Architecture and Pretraining Objective Works Best for Zero-Shot Generalization?
ICML 2022
Datasets: A Community Library for Natural Language Processing
EMNLP 2021
How many data points is a prompt worth?
NAACL 2021
Transformers: State-of-the-Art Natural Language Processing
EMNLP 2020