Interpretability

7318 directly classified papers

Papers

Identifying Drivers of Predictive Aleatoric Uncertainty IJCAI 2025

XDAC: XAI-Driven Detection and Attribution of LLM-Generated News Comments in Korean ACL 2025

Keys to Robust Edits: From Theoretical Insights to Practical Advances ACL 2025

Verifying Quantized Graph Neural Networks is PSPACE-complete IJCAI 2025

PropXplain: Can LLMs Enable Explainable Propaganda Detection? EMNLP 2025

Circuit-Tracer: A New Library for Finding Feature Circuits EMNLP 2025

Disentangling Language and Culture for Evaluating Multilingual Large Language Models ACL 2025

Aligned but Blind: Alignment Increases Implicit Bias by Reducing Awareness of Race ACL 2025

Explainable Graph Representation Learning via Graph Pattern Analysis IJCAI 2025

Can LLMs Reason About Program Semantics? A Comprehensive Evaluation of LLMs on Formal Specification Inference ACL 2025

Prompt-Guided Internal States for Hallucination Detection of Large Language Models ACL 2025

DGExplainer: Explaining Dynamic Graph Neural Networks via Relevance Back-propagation IJCAI 2025

Generative Annotation for ASR Named Entity Correction EMNLP 2025

Exclusion of Thought: Mitigating Cognitive Load in Large Language Models for Enhanced Reasoning in Multiple-Choice Tasks ACL 2025

Enhancing Goal-oriented Proactive Dialogue Systems via Consistency Reflection and Correction ACL 2025

Explainable Graph Neural Networks via Structural Externalities IJCAI 2025

CogniBench: A Legal-inspired Framework and Dataset for Assessing Cognitive Faithfulness of Large Language Models ACL 2025

Neuron Empirical Gradient: Discovering and Quantifying Neurons’ Global Linear Controllability ACL 2025

ASCENT-ViT: Attention-based Scale-aware Concept Learning Framework for Enhanced Alignment in Vision Transformers IJCAI 2025

How Well Can Reasoning Models Identify and Recover from Unhelpful Thoughts? EMNLP 2025

Simulating Identity, Propagating Bias: Abstraction and Stereotypes in LLM-Generated Text EMNLP 2025

AdvERSEM: Adversarial Robustness Testing and Training of LLM-based Groundedness Evaluators via Semantic Structure Manipulation EMNLP 2025

VF-Eval: Evaluating Multimodal LLMs for Generating Feedback on AIGC Videos ACL 2025

Using Shapley interactions to understand how models use structure ACL 2025

Priority Guided Explanation for Knowledge Tracing with Dual Ranking and Similarity Consistency IJCAI 2025