Interpretability

7318 directly classified papers

Papers

What Makes a Good Reasoning Chain? Uncovering Structural Patterns in Long Chain-of-Thought Reasoning EMNLP 2025

ViLBench: A Suite for Vision-Language Process Reward Modeling EMNLP 2025

Noise, Adaptation, and Strategy: Assessing LLM Fidelity in Decision-Making EMNLP 2025

Investigating Value-Reasoning Reliability in Small Large Language Models EMNLP 2025

DiscoSG: Towards Discourse-Level Text Scene Graph Parsing through Iterative Graph Refinement EMNLP 2025

Probing for Arithmetic Errors in Language Models EMNLP 2025

Robust Native Language Identification through Agentic Decomposition EMNLP 2025

When Annotators Disagree, Topology Explains: Mapper, a Topological Tool for Exploring Text Embedding Geometry and Ambiguity EMNLP 2025

Self-Critique and Refinement for Faithful Natural Language Explanations EMNLP 2025

Steering Language Models in Multi-Token Generation: A Case Study on Tense and Aspect EMNLP 2025

Reason to Rote: Rethinking Memorization in Reasoning EMNLP 2025

VELA: An LLM-Hybrid-as-a-Judge Approach for Evaluating Long Image Captions EMNLP 2025

Sheaf Discovery with Joint Computation Graph Pruning and Flexible Granularity EMNLP 2025

HVGuard: Utilizing Multimodal Large Language Models for Hateful Video Detection EMNLP 2025

Token-Aware Editing of Internal Activations for Large Language Model Alignment EMNLP 2025

Measuring Chain of Thought Faithfulness by Unlearning Reasoning Steps EMNLP 2025

SPaRC: A Spatial Pathfinding Reasoning Challenge EMNLP 2025

Towards Faithful Natural Language Explanations: A Study Using Activation Patching in Large Language Models EMNLP 2025

Calibrating LLM Confidence by Probing Perturbed Representation Stability EMNLP 2025

Feature Extraction and Steering for Enhanced Chain-of-Thought Reasoning in Language Models EMNLP 2025

Long-Form Information Alignment Evaluation Beyond Atomic Facts EMNLP 2025

LATTE: Learning to Think with Vision Specialists EMNLP 2025

Back Attention: Understanding and Enhancing Multi-Hop Reasoning in Large Language Models EMNLP 2025

DEL-ToM: Inference-Time Scaling for Theory-of-Mind Reasoning via Dynamic Epistemic Logic EMNLP 2025

Uncertainty-Aware Regularization for Image-to-Image Translation WACV 2025