Souradip Chakraborty

20 papers · 2020–2026 · 10 top CS/AI conferences

Achievements

🏃 Academic Marathon (5) 🐝 Cross-Pollinator (11) 🌍 Conference Polyglot (9) 🌉 Interdisciplinary Bridge 🌈 Renaissance Researcher (6)

🧭 Keyword Pioneer 🤝 Dynamic Duo (12) 👑 Triple Crown 🏆 Grand Slam 🧬 Topic Evolution 💎 Century Club (19) 🗃️ Keyword Collector (63) ⚡ Prolific Year (5) ❓ The Questioner (2) 🔥 Unstoppable (5)

Conferences

ICML (5) AAAI (4) ICLR (3) COLING (2) CORL (1) CVPR (1) EACL (1) EMNLP (1) NIPS (1) SEMEVAL (1)

Top co-authors

Furong Huang (13) Dinesh Manocha (10) Amrit Singh Bedi (8) Mengdi Wang (7) Alec Koppel (6) Amrit Bedi (5) Soumya Suvra Ghosal (4) Avinash Reddy (2) Ekansh Verma (2) Sicheng Zhu (2)

Keywords

reinforcement learning from human feedback (2) language model (2) inference-time alignment (2) model-based reinforcement learning (2) safety alignment (2) ensemble learning (2) jailbreak attack (2) adversarial attack (2) text mining (1) policy gradient (1) data poisoning (1) information retrieval (1) text generation (1) posterior estimation (1) text classification (1) backdoor attack (1) bayesian regret (1) question answering (1) optimal decoding (1) reward function (1)

Papers

Jailbreaks as Inference-Time Alignment: A Framework for Understanding Safety Failures in LLMs · EACL 2026
Bounded Rationality for LLMs: Satisficing Alignment at Inference-Time · ICML 2025
Can Watermarking Large Language Models Prevent Copyrighted Text Generation and Hide Training Data? · AAAI 2025
Is Poisoning a Real Threat to DPO? Maybe More So Than You Think · AAAI 2025
Align-Pro: A Principled Approach to Prompt Optimization for LLM Alignment · AAAI 2025
Collab: Controlled Decoding using Mixture of Agents for LLM Alignment · ICLR 2025
Immune: Improving Safety Against Jailbreaks in Multi-modal LLMs via Inference-Time Alignment · CVPR 2025
Uncertainty-Aware Answer Selection for Improved Reasoning in Multi-LLM Systems · EMNLP 2025
PARL: A Unified Framework for Policy Alignment in Reinforcement Learning from Human Feedback · ICLR 2024
Transfer Q-star: Principled Decoding for LLM Alignment · NIPS 2024
Rethinking Adversarial Policies: A Generalized Attack Formulation and Provable Defense in RL · ICLR 2024
Position: On the Possibilities of AI-Generated Text Detection · ICML 2024
MaxMin-RLHF: Alignment with Diverse Human Preferences · ICML 2024
Posterior Coreset Construction with Kernelized Stein Discrepancy for Model-Based Reinforcement Learning · AAAI 2023
STEERING: Stein Information Directed Exploration for Model-Based Reinforcement Learning · ICML 2023
HTRON: Efficient Outdoor Navigation with Sparse Rewards via Heavy Tailed Adaptive Reinforce Algorithm · CORL 2022
On the Hidden Biases of Policy Mirror Ascent in Continuous Action Spaces · ICML 2022
Transformers at SemEval-2020 Task 11: Propaganda Fragment Detection Using Diversified BERT Architectures Based Ensemble Learning · COLING 2020
Transformers at SemEval-2020 Task 11: Propaganda Fragment Detection Using Diversified BERT Architectures Based Ensemble Learning · SEMEVAL 2020
BioMedBERT: A Pre-trained Biomedical Language Model for QA and IR · COLING 2020