Sainbayar Sukhbaatar

26 papers · 2015–2025 · 10 conferences · across top CS/AI conferences

Achievements

🌈 Renaissance Researcher (6) 🧭 Keyword Pioneer 🌍 Conference Polyglot (10) 🏃 Academic Marathon (10) 🌉 Interdisciplinary Bridge 🐣 Hot Topic Early Bird 🌟 Keyword Trendsetter Combo (4) 🌱 Topic Pioneer 🏆 Grand Slam 📈 Trend Setter ⚡ Prolific Year (5) 🗃️ Keyword Collector (94) 🔥 Unstoppable (5) 💎 Century Club (26) 🚀 Conference Pioneer

Conferences

ICML (6) NIPS (6) ACL (3) EMNLP (3) ICLR (3) AAAI (1) AACL (1) CORL (1) IJCNLP (1) UAI (1)

Top co-authors

Arthur Szlam (8) Jason E Weston (8) Jason Weston (8) Weizhe Yuan (7) Jing Xu (6) Rob Fergus (4) Tianhao Wu (4) Piotr Bojanowski (3) Yuandong Tian (3) Kyunghyun Cho (3)

Keywords

language model (6) language modeling (5) text generation (4) reinforcement learning (4) sequence modeling (2) transformer architecture (2) large language model (2) classification head (2) attention mechanism (2) question answering (2) mathematical reasoning (2) instruction following (2) conditional generation (2) response generation (1) self-attention mechanism (1) knowledge representation (1) chain-of-thought reasoning (1) offline reinforcement learning (1) policy learning (1) word segmentation (1)

Papers

Meta-Rewarding Language Models: Self-Improving Alignment with LLM-as-a-Meta-Judge · EMNLP 2025
Following Length Constraints in Instructions · EMNLP 2025
Step-KTO: Optimizing Mathematical Reasoning through Stepwise Binary Feedback · EMNLP 2025
Dualformer: Controllable Fast and Slow Thinking by Learning with Randomized Reasoning Traces · ICLR 2025
Thinking LLMs: General Instruction Following with Thought Generation · ICML 2025
R.I.P.: Better Models by Survival of the Fittest Prompts · ICML 2025
Self-Consistency Preference Optimization · ICML 2025
Self-Rewarding Language Models · ICML 2024
Iterative Reasoning Preference Optimization · NIPS 2024
A Data Source for Reasoning Embodied Agents · AAAI 2023
Learning to Reason and Memorize with Self-Notes · NIPS 2023
The CRINGE Loss: Learning what language not to model · ACL 2023
Temporal abstractions-augmented temporally contrastive learning: An alternative to the Laplacian in RL · UAI 2022
Staircase Attention for Recurrent Processing of Sequences · NIPS 2022
Learning Goal-Conditioned Policies Offline with Self-Supervised Reward Shaping · CORL 2022
Director: Generator-Classifiers For Supervised Language Modeling · AACL 2022
Director: Generator-Classifiers For Supervised Language Modeling · IJCNLP 2022
Hash Layers For Large Sparse Models · NIPS 2021
Not All Memories are Created Equal: Learning to Forget by Expiring · ICML 2021
Adaptive Attention Span in Transformers · ACL 2019
Learning when to Communicate at Scale in Multiagent Cooperative and Competitive Tasks · ICLR 2019
Training Hybrid Language Models by Marginalizing over Segmentations · ACL 2019
Composable Planning with Attributes · ICML 2018
Intrinsic Motivation and Automatic Curricula via Asymmetric Self-Play · ICLR 2018
Learning Multiagent Communication with Backpropagation · NIPS 2016
End-To-End Memory Networks · NIPS 2015