Sainbayar Sukhbaatar
26 papers · 2015–2025 · 10 top CS/AI conferences
Achievements
🌈 Renaissance Researcher (6)
🧭 Keyword Pioneer
🌍 Conference Polyglot (10)
🏃 Academic Marathon (10)
🌉 Interdisciplinary Bridge
🐣 Hot Topic Early Bird
🌟 Keyword Trendsetter Combo (4)
🌱 Topic Pioneer
🏆 Grand Slam
📈 Trend Setter
⚡ Prolific Year (5)
🗃️ Keyword Collector (94)
🔥 Unstoppable (5)
💎 Century Club (26)
🚀 Conference Pioneer
Conferences
ICML (6)
NIPS (6)
ACL (3)
EMNLP (3)
ICLR (3)
AAAI (1)
AACL (1)
CORL (1)
IJCNLP (1)
UAI (1)
Keywords
language model (6)
language modeling (5)
text generation (4)
reinforcement learning (4)
sequence modeling (2)
transformer architecture (2)
large language model (2)
classification head (2)
attention mechanism (2)
question answering (2)
mathematical reasoning (2)
instruction following (2)
conditional generation (2)
response generation (1)
self-attention mechanism (1)
knowledge representation (1)
chain-of-thought reasoning (1)
offline reinforcement learning (1)
policy learning (1)
word segmentation (1)
Papers
Meta-Rewarding Language Models: Self-Improving Alignment with LLM-as-a-Meta-Judge
EMNLP 2025
Following Length Constraints in Instructions
EMNLP 2025
Step-KTO: Optimizing Mathematical Reasoning through Stepwise Binary Feedback
EMNLP 2025
Dualformer: Controllable Fast and Slow Thinking by Learning with Randomized Reasoning Traces
ICLR 2025
Thinking LLMs: General Instruction Following with Thought Generation
ICML 2025
R.I.P.: Better Models by Survival of the Fittest Prompts
ICML 2025
Self-Consistency Preference Optimization
ICML 2025
Self-Rewarding Language Models
ICML 2024
Iterative Reasoning Preference Optimization
NIPS 2024
A Data Source for Reasoning Embodied Agents
AAAI 2023
Learning to Reason and Memorize with Self-Notes
NIPS 2023
The CRINGE Loss: Learning what language not to model
ACL 2023
Temporal abstractions-augmented temporally contrastive learning: An alternative to the Laplacian in RL
UAI 2022
Staircase Attention for Recurrent Processing of Sequences
NIPS 2022
Learning Goal-Conditioned Policies Offline with Self-Supervised Reward Shaping
CORL 2022
Director: Generator-Classifiers For Supervised Language Modeling
AACL 2022
Director: Generator-Classifiers For Supervised Language Modeling
IJCNLP 2022
Hash Layers For Large Sparse Models
NIPS 2021
Not All Memories are Created Equal: Learning to Forget by Expiring
ICML 2021
Adaptive Attention Span in Transformers
ACL 2019
Learning when to Communicate at Scale in Multiagent Cooperative and Competitive Tasks
ICLR 2019
Training Hybrid Language Models by Marginalizing over Segmentations
ACL 2019
Composable Planning with Attributes
ICML 2018
Intrinsic Motivation and Automatic Curricula via Asymmetric Self-Play
ICLR 2018
Learning Multiagent Communication with Backpropagation
NIPS 2016
End-To-End Memory Networks
NIPS 2015