Amit Dhurandhar
29 papers · 2008–2025 · 9 top CS/AI venues
Achievements
🐝 Cross-Pollinator (14)
🧭 Keyword Pioneer
🌈 Renaissance Researcher (8)
🌍 Conference Polyglot (9)
🏃 Academic Marathon (17)
🌟 Keyword Trendsetter Combo (5)
🤝 Dynamic Duo (12)
🏆 Grand Slam
🔬 Deep Specialist (12)
🧬 Topic Evolution
🏆 Keyword Champion (4)
👥 Mega-Team (20)
🔥 Unstoppable (6)
⚡ Prolific Year (5)
❓ The Questioner (3)
💎 Century Club (29)
🗃️ Keyword Collector (109)
Conferences
NIPS (8)
AAAI (4)
ACL (4)
ICML (4)
ICLR (3)
JMLR (3)
AISTATS (1)
ECCV (1)
EMNLP (1)
Keywords
local explanation (4)
feature attribution (4)
explainable ai (4)
large language model (4)
neural network (3)
transfer learning (3)
invariant risk minimization (3)
decision tree (3)
domain generalization (3)
model interpretability (3)
contrastive explanation (2)
interpretable model (2)
nash equilibrium (2)
data augmentation (1)
knowledge distillation (1)
model selection (1)
knowledge transfer (1)
text classification (1)
semi-supervised learning (1)
contrastive learning (1)
Papers
Multi-Level Explanations for Generative Language Models
ACL 2025
Protecting Users From Themselves: Safeguarding Contextual Privacy in Interactions with Conversational Agents
ACL 2025
Programming Refusal with Conditional Activation Steering
ICLR 2025
Ranking Large Language Models without Ground Truth
ACL 2024
NeuroPrune: A Neuro-inspired Topological Sparse Training Algorithm for Large Language Models
ACL 2024
Trust Regions for Explanations via Black-Box Probabilistic Certification
ICML 2024
Integrating Markov Blanket Discovery into Causal Representation Learning for Domain Generalization
ECCV 2024
Local Explanations for Reinforcement Learning
AAAI 2023
Reprogramming Pretrained Language Models for Antibody Sequence Infilling
ICML 2023
Locally Invariant Explanations: Towards Stable and Unidirectional Explanations through Local Invariant Learning
NIPS 2023
When Neural Networks Fail to Generalize? A Model Sensitivity Perspective
AAAI 2023
Auto-Transfer: Learning to Route Transferable Representations
ICLR 2022
Is this the Right Neighborhood? Accurate and Query Efficient Model Agnostic Explanations
NIPS 2022
On the Safety of Interpretable Machine Learning: A Maximum Deviation Approach
NIPS 2022
AI Explainability 360: Impact and Design
AAAI 2022
Let the CAT out of the bag: Contrastive Attributed explanations for Text
EMNLP 2022
Linear Regression Games: Convergence Guarantees to Approximate Out-of-Distribution Solutions
AISTATS 2021
CoFrNets: Interpretable Neural Architecture Inspired by Continued Fractions
NIPS 2021
Empirical or Invariant Risk Minimization? A Sample Complexity Perspective
ICLR 2021
Anomaly Attribution with Likelihood Compensation
AAAI 2021
AI Explainability 360: An Extensible Toolkit for Understanding Data and Machine Learning Models
JMLR 2020
Learning Global Transparent Models consistent with Local Contrastive Explanations
NIPS 2020
Model Agnostic Multilevel Explanations
NIPS 2020
Invariant Risk Minimization Games
ICML 2020
Enhancing Simple Models by Exploiting What They Already Know
ICML 2020
Explanations based on the Missing: Towards Contrastive Explanations with Pertinent Negatives
NIPS 2018
Improving Simple Models with Confidence Profiles
NIPS 2018
Efficient and Accurate Methods for Updating Generalized Linear Models with Multiple Feature Additions
JMLR 2014
Probabilistic Characterization of Random Decision Trees
JMLR 2008