Amit Dhurandhar

29 papers · 2008–2025 · 9 conferences · across top CS/AI conferences

Achievements

🐝 Cross-Pollinator (14) 🌍 Conference Polyglot (9) 🧭 Keyword Pioneer 🏃 Academic Marathon (17) 🌈 Renaissance Researcher (8) 🌟 Keyword Trendsetter Combo (5) 🤝 Dynamic Duo (12) 🏆 Grand Slam 🔬 Deep Specialist (12) 🧬 Topic Evolution 🏆 Keyword Champion (4) 👥 Mega-Team (20) 🔥 Unstoppable (6) ⚡ Prolific Year (5) ❓ The Questioner (3) 💎 Century Club (29) 🗃️ Keyword Collector (109)

Conferences

NIPS (8) AAAI (4) ACL (4) ICML (4) ICLR (3) JMLR (3) AISTATS (1) ECCV (1) EMNLP (1)

Top co-authors

Karthikeyan Shanmugam (12) Ronny Luss (10) Karthikeyan Natesan Ramamurthy (8) Dennis Wei (6) Pin-Yu Chen (6) Moninder Singh (5) Kartik Ahuja (4) Yunfeng Zhang (3) Vijay Arya (3) Payel Das (3)

Keywords

local explanation (4) feature attribution (4) explainable ai (4) large language model (4) neural network (3) transfer learning (3) invariant risk minimization (3) decision tree (3) domain generalization (3) model interpretability (3) contrastive explanation (2) interpretable model (2) nash equilibrium (2) data augmentation (1) knowledge distillation (1) model selection (1) knowledge transfer (1) text classification (1) semi-supervised learning (1) contrastive learning (1)

Papers

Multi-Level Explanations for Generative Language Models · ACL 2025
Protecting Users From Themselves: Safeguarding Contextual Privacy in Interactions with Conversational Agents · ACL 2025
Programming Refusal with Conditional Activation Steering · ICLR 2025
Ranking Large Language Models without Ground Truth · ACL 2024
NeuroPrune: A Neuro-inspired Topological Sparse Training Algorithm for Large Language Models · ACL 2024
Trust Regions for Explanations via Black-Box Probabilistic Certification · ICML 2024
Integrating Markov Blanket Discovery into Causal Representation Learning for Domain Generalization · ECCV 2024
Local Explanations for Reinforcement Learning · AAAI 2023
Reprogramming Pretrained Language Models for Antibody Sequence Infilling · ICML 2023
Locally Invariant Explanations: Towards Stable and Unidirectional Explanations through Local Invariant Learning · NIPS 2023
When Neural Networks Fail to Generalize? A Model Sensitivity Perspective · AAAI 2023
Auto-Transfer: Learning to Route Transferable Representations · ICLR 2022
Is this the Right Neighborhood? Accurate and Query Efficient Model Agnostic Explanations · NIPS 2022
On the Safety of Interpretable Machine Learning: A Maximum Deviation Approach · NIPS 2022
AI Explainability 360: Impact and Design · AAAI 2022
Let the CAT out of the bag: Contrastive Attributed explanations for Text · EMNLP 2022
Linear Regression Games: Convergence Guarantees to Approximate Out-of-Distribution Solutions · AISTATS 2021
CoFrNets: Interpretable Neural Architecture Inspired by Continued Fractions · NIPS 2021
Empirical or Invariant Risk Minimization? A Sample Complexity Perspective · ICLR 2021
Anomaly Attribution with Likelihood Compensation · AAAI 2021
AI Explainability 360: An Extensible Toolkit for Understanding Data and Machine Learning Models · JMLR 2020
Learning Global Transparent Models consistent with Local Contrastive Explanations · NIPS 2020
Model Agnostic Multilevel Explanations · NIPS 2020
Invariant Risk Minimization Games · ICML 2020
Enhancing Simple Models by Exploiting What They Already Know · ICML 2020
Explanations based on the Missing: Towards Contrastive Explanations with Pertinent Negatives · NIPS 2018
Improving Simple Models with Confidence Profiles · NIPS 2018
Efficient and Accurate Methods for Updating Generalized Linear Models with Multiple Feature Additions · JMLR 2014
Probabilistic Characterization of Random Decision Trees · JMLR 2008