Interpretability

7318 directly classified papers

Papers

No Need for Explanations: LLMs can implicitly learn from mistakes in-context EMNLP 2025

Quantifying Logical Consistency in Transformers via Query-Key Alignment EMNLP 2025

Probing Subphonemes in Morphology Models ACL 2025

Towards Explainable Hate Speech Detection ACL 2025

Evaluating Intermediate Reasoning of Code-Assisted Large Language Models for Mathematics ACL 2025

A Tale of Evaluating Factual Consistency: Case Study on Long Document Summarization Evaluation ACL 2025

Explainable Depression Detection in Clinical Interviews with Personalized Retrieval-Augmented Generation ACL 2025

The Confidence Paradox: Can LLM Know When It’s Wrong? IJCNLP 2025

From Calculation to Adjudication: Examining LLM Judges on Mathematical Reasoning Tasks ACL 2025

Don’t Miss the Forest for the Trees: Attentional Vision Calibration for Large Vision Language Models ACL 2025

Behavioural vs. Representational Systematicity in End-to-End Models: An Opinionated Survey ACL 2025

Interpreting the Effects of Quantization on LLMs IJCNLP 2025

IRIS: Interpretable Retrieval-Augmented Classification for Long Interspersed Document Sequences ACL 2025

PRISM: A Framework for Producing Interpretable Political Bias Embeddings with Political-Aware Cross-Encoder ACL 2025

(Towards) Scalable Reliable Automated Evaluation with Large Language Models ACL 2025

Inherent and emergent liability issues in LLM-based agentic systems: a principal-agent perspective ACL 2025

AutoCT: Automating Interpretable Clinical Trial Prediction with LLM Agents EMNLP 2025

Language Models Grow Less Humanlike beyond Phase Transition ACL 2025

Mamba Knockout for Unraveling Factual Information Flow ACL 2025

Spatial Representation of Large Language Models in 2D Scene ACL 2025

Decoding Knowledge Attribution in Mixture-of-Experts: A Framework of Basic-Refinement Collaboration and Efficiency Analysis ACL 2025

Dynamic Head Selection for Neural Lexicalized Constituency Parsing ACL 2025

Free-text Rationale Generation under Readability Level Control ACL 2025

All for One: LLMs Solve Mental Math at the Last Token With Information Transferred From Other Tokens EMNLP 2025

Circuit Compositions: Exploring Modular Structures in Transformer-Based Language Models ACL 2025