Interpretability

7318 directly classified papers

Papers

Semi-supervised Concept Bottleneck Models ICCV 2025

LibraGrad: Balancing Gradient Flow for Universally Better Vision Transformer Attributions CVPR 2025

Dissecting and Mitigating Diffusion Bias via Mechanistic Interpretability CVPR 2025

Explaining in Diffusion: Explaining a Classifier with Diffusion Semantics CVPR 2025

Interpretable Image Classification via Non-parametric Part Prototype Learning CVPR 2025

Bridging the Gap Between Ideal and Real-world Evaluation: Benchmarking AI-Generated Image Detection in Challenging Scenarios ICCV 2025

Prompt-CAM: Making Vision Transformers Interpretable for Fine-Grained Analysis CVPR 2025

VERA: Explainable Video Anomaly Detection via Verbalized Learning of Vision-Language Models CVPR 2025

Intervening in Black Box: Concept Bottleneck Model for Enhancing Human Neural Network Mutual Understanding ICCV 2025

Can Large Vision-Language Models Correct Semantic Grounding Errors By Themselves? CVPR 2025

Interactive Medical Image Analysis with Concept-based Similarity Reasoning CVPR 2025

Probing and Boosting Large Language Models Capabilities via Attention Heads EMNLP 2025

Learning from Sufficient Rationales: Analysing the Relationship Between Explanation Faithfulness and Token-level Regularisation Strategies IJCNLP 2025

Leveraging Spatial Invariance to Boost Adversarial Transferability ICCV 2025

Language Arithmetics: Towards Systematic Language Neuron Identification and Manipulation IJCNLP 2025

Understanding and Controlling Repetition Neurons and Induction Heads in In-Context Learning IJCNLP 2025

AIM: Amending Inherent Interpretability via Self-Supervised Masking ICCV 2025

SUB: Benchmarking CBM Generalization via Synthetic Attribute Substitutions ICCV 2025

Comprehensive Information Bottleneck for Unveiling Universal Attribution to Interpret Vision Transformers CVPR 2025

Derivative-Free Diffusion Manifold-Constrained Gradient for Unified XAI CVPR 2025

Large Language Models Encode Semantics and Alignment in Linearly Separable Representations IJCNLP 2025

Interpreting the Effects of Quantization on LLMs IJCNLP 2025

NAVER: A Neuro-Symbolic Compositional Automaton for Visual Grounding with Explicit Logic Reasoning ICCV 2025

The Confidence Paradox: Can LLM Know When It’s Wrong? IJCNLP 2025

The Visual Counter Turing Test (VCT²): A Benchmark for Evaluating AI-Generated Image Detection and the Visual AI Index (V_AI) IJCNLP 2025