Interpretability

7318 directly classified papers

Papers

Pattern Recognition or Medical Knowledge? The Problem with Multiple-Choice Questions in Medicine ACL 2025

Lost in Multilinguality: Dissecting Cross-lingual Factual Inconsistency in Transformer Language Models ACL 2025

When Annotators Disagree, Topology Explains: Mapper, a Topological Tool for Exploring Text Embedding Geometry and Ambiguity EMNLP 2025

Unveiling Language-Specific Features in Large Language Models via Sparse Autoencoders ACL 2025

ProtoLens: Advancing Prototype Learning for Fine-Grained Interpretability in Text Classification ACL 2025

PersonalizedUS: Interpretable Breast Cancer Risk Assessment with Local Coverage Uncertainty Quantification AAAI 2025

Self-Critique and Refinement for Faithful Natural Language Explanations EMNLP 2025

Interpret and Improve In-Context Learning via the Lens of Input-Label Mappings ACL 2025

Establishing Trustworthy LLM Evaluation via Shortcut Neuron Analysis ACL 2025

Tree-of-Quote Prompting Improves Factuality and Attribution in Multi-Hop and Medical Reasoning EMNLP 2025

Cracking the Code of Hallucination in LVLMs with Vision-aware Head Divergence ACL 2025

Retrieve to Explain: Evidence-driven Predictions for Explainable Drug Target Identification ACL 2025

LLM-Microscope: Uncovering the Hidden Role of Punctuation in Context Memory of Transformers NAACL 2025

ViLBench: A Suite for Vision-Language Process Reward Modeling EMNLP 2025

Towards Faithful Natural Language Explanations: A Study Using Activation Patching in Large Language Models EMNLP 2025

Enhancing Trustworthiness of Graph Neural Networks with Rank-Based Conformal Training AAAI 2025

Learning About Algorithm Auditing in Five Steps: Scaffolding How High School Youth Can Systematically and Critically Evaluate Machine Learning Applications AAAI 2025

NAVER: A Neuro-Symbolic Compositional Automaton for Visual Grounding with Explicit Logic Reasoning ICCV 2025

Enhancing Skin Disease Diagnosis: Interpretable Visual Concept Discovery with SAM WACV 2025

No Questions are Stupid, but some are Poorly Posed: Understanding Poorly-Posed Information-Seeking Questions ACL 2025

Aristotle: Mastering Logical Reasoning with A Logic-Complete Decompose-Search-Resolve Framework ACL 2025

Does Your AI Agent Get You? A Personalizable Framework for Approximating Human Models from Argumentation-based Dialogue Traces AAAI 2025

Position-aware Automatic Circuit Discovery ACL 2025

Assessment and manipulation of latent constructs in pre-trained language models using psychometric scales ACL 2025

Active Fourier Auditor for Estimating Distributional Properties of ML Models AAAI 2025