Interpretability

7318 directly classified papers

Papers

DeepCAVE: A Visualization and Analysis Tool for Automated Machine Learning JMLR 2025

Evaluating Sensitivity Consistency of Explanations WACV 2025

Information-Guided Identification of Training Data Imprint in (Proprietary) Large Language Models NAACL 2025

Mapping the Minds of LLMs: A Graph-Based Analysis of Reasoning LLMs EMNLP 2025

An Interpretable and Crosslingual Method for Evaluating Second-Language Dialogues NAACL 2025

BANMIME: Misogyny Detection with Metaphor Explanation on Bangla Memes EMNLP 2025

Imitate Before Detect: Aligning Machine Stylistic Preference for Machine-Revised Text Detection AAAI 2025

Discourse-Driven Evaluation: Unveiling Factual Inconsistency in Long Document Summarization NAACL 2025

Cascaded Information Disclosure for Generalized Evaluation of Problem Solving Capabilities IJCNLP 2025

Identifying Pre-training Data in LLMs: A Neuron Activation-Based Detection Framework EMNLP 2025

Benchmarking and Understanding Compositional Relational Reasoning of LLMs AAAI 2025

What Did I Do Wrong? Quantifying LLMs’ Sensitivity and Consistency to Prompt Engineering NAACL 2025

Option Symbol Matters: Investigating and Mitigating Multiple-Choice Option Symbol Bias of Large Language Models NAACL 2025

Beyond Accuracy: On the Effects of Fine-Tuning Towards Vision-Language Model’s Prediction Rationality AAAI 2025

Identifying Query-Relevant Neurons in Large Language Models for Long-Form Texts AAAI 2025

Token-Level Density-Based Uncertainty Quantification Methods for Eliciting Truthfulness of Large Language Models NAACL 2025

SafeQuant: LLM Safety Analysis via Quantized Gradient Inspection NAACL 2025

Deep Out-of-Distribution Uncertainty Quantification via Weight Entropy Maximization JMLR 2025

Seeing Eye to AI: Comparing Human Gaze and Model Attention in Video Memorability WACV 2025

AstroMLab 5: Structured Summaries and Concept Extraction for 400,000 Astrophysics Papers IJCNLP 2025

Eliciting Causal Abilities in Large Language Models for Reasoning Tasks AAAI 2025

PerCoR: Evaluating Commonsense Reasoning in Persian via Multiple-Choice Sentence Completion IJCNLP 2025

SQUAB: Evaluating LLM robustness to Ambiguous and Unanswerable Questions in Semantic Parsing EMNLP 2025

Enhancing Uncertainty Modeling with Semantic Graph for Hallucination Detection AAAI 2025

Towards Unifying Evaluation of Counterfactual Explanations: Leveraging Large Language Models for Human-Centric Assessments AAAI 2025