Sachin Kumar

34 papers · 2017–2026 · 8 top CS/AI conferences

Achievements

🏃 Academic Marathon (8) 🌍 Conference Polyglot (8) 🌉 Interdisciplinary Bridge 🧭 Keyword Pioneer 🐝 Cross-Pollinator (12)

🌈 Renaissance Researcher (6) 🤝 Dynamic Duo (23) 👥 Mega-Team (36) 🧬 Topic Evolution 🏆 Keyword Champion (3) 🗃️ Keyword Collector (164) ⚡ Prolific Year (7) ❓ The Questioner (2) 🔥 Unstoppable (7) 💎 Century Club (33)

Conferences

ACL (9) EMNLP (8) NAACL (5) NIPS (4) EACL (2) ICLR (2) IJCAI (2) IJCNLP (2)

Top co-authors

Yulia Tsvetkov (23) Noah A. Smith (7) Hannaneh Hajishirzi (6) Yejin Choi (5) Shuly Wintner (4) Vidhisha Balachandran (4) Antonios Anastasopoulos (4) Nouha Dziri (3) Xiaochuang Han (3) Valentina Pyatkin (3)

Keywords

language model (9) machine translation (5) text generation (4) large language model (3) reinforcement learning from human feedback (3) diffusion language model (3) adversarial learning (2) representation learning (2) reward model (2) low-resource language (2) domain adaptation (2) benchmark evaluation (2) zero-shot learning (2) text classification (2) diffusion model (2) language model alignment (2) adversarial training (2) responsible ai (2) gradient-based optimization (2) preference optimization (2)

Papers

ClinicalTrialsHub: Bridging Registries and Literature for Comprehensive Clinical Trial Access · EACL 2026
CE-Bench: Towards a Reliable Contrastive Evaluation Benchmark of Interpretability of Sparse Autoencoders · EMNLP 2025
RewardBench: Evaluating Reward Models for Language Modeling · NAACL 2025
Hybrid Preferences: Learning to Route Instances for Human vs. AI Feedback · ACL 2025
Steering off Course: Reliability Challenges in Steering Language Models · ACL 2025
TESS 2: A Large-Scale Generalist Diffusion Language Model · ACL 2025
GroundCocoa: A Benchmark for Evaluating Compositional & Conditional Reasoning in Language Models · NAACL 2025
ComPO: Community Preferences for Language Model Personalization · NAACL 2025
MAGNET: Improving the Multilingual Fairness of Language Models with Adaptive Gradient-Based Tokenization · NIPS 2024
The Art of Saying No: Contextual Noncompliance in Language Models · NIPS 2024
Gen-Z: Generative Zero-Shot Text Classification with Contextualized Label Descriptions · ICLR 2024
P3Sum: Preserving Author’s Perspective in News Summarization with Diffusion Language Models · NAACL 2024
WildTeaming at Scale: From In-the-Wild Jailbreaks to (Adversarially) Safer Language Models · NIPS 2024
Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research · ACL 2024
David helps Goliath: Inference-Time Collaboration Between Small Specialized and Large General Diffusion LMs · NAACL 2024
On the Blind Spots of Model-Based Evaluation Metrics for Text Generation · ACL 2023
Language Generation Models Can Cause Harm: So What Can We Do About It? An Actionable Survey · EACL 2023
Do All Languages Cost the Same? Tokenization in the Era of Commercial Language Models · EMNLP 2023
Mitigating Societal Harms in Large Language Models · EMNLP 2023
SSD-LM: Semi-autoregressive Simplex-based Diffusion Language Model for Text Generation and Modular Control · ACL 2023
Minding Language Models’ (Lack of) Theory of Mind: A Plug-and-Play Multi-Character Belief Tracker · ACL 2023
Gradient-based Constrained Sampling from Language Models · EMNLP 2022
Referee: Reference-Free Sentence Summarization with Sharper Controllability through Symbolic Knowledge Distillation · EMNLP 2022
Machine Translation into Low-resource Language Varieties · IJCNLP 2021
Controlled Text Generation as Continuous Optimization with Multiple Constraints · NIPS 2021
Machine Translation into Low-resource Language Varieties · ACL 2021
Improving the Diversity of Unsupervised Paraphrasing with Embedding Outputs · EMNLP 2021
A Deep Reinforced Model for Zero-Shot Cross-Lingual Summarization with Bilingual Semantic Similarity Rewards · ACL 2020
Neural Abstractive Summarization with Structural Attention · IJCAI 2020
Topics to Avoid: Demoting Latent Confounds in Text Classification · IJCNLP 2019
A Margin-based Loss with Synthetic Negative Samples for Continuous-output Machine Translation · EMNLP 2019
Topics to Avoid: Demoting Latent Confounds in Text Classification · EMNLP 2019
Von Mises-Fisher Loss for Training Sequence to Sequence Models with Continuous Outputs · ICLR 2019
Earth Mover's Distance Pooling over Siamese LSTMs for Automatic Short Answer Grading · IJCAI 2017