Research Explorer
Interpretability
7318 directly classified papers
Papers per year
2003: 1
2006: 1
2007: 1
2008: 1
2009: 1
2010: 5
2012: 2
2013: 10
2014: 7
2015: 14
2016: 27
2017: 84
2018: 196
2019: 395
2020: 488
2021: 771
2022: 823
2023: 954
2024: 1360
2025: 1713
2026: 464
Papers
bea-jh at BEA 2025 Shared Task: Evaluating AI-powered Tutors through Pedagogically-Informed Reasoning
ACL 2025
Masculine Defaults via Gendered Discourse in Podcasts and Large Language Models
ACL 2025
MPBench: A Comprehensive Multimodal Reasoning Benchmark for Process Errors Identification
ACL 2025
Separating Tongue from Thought: Activation Patching Reveals Language-Agnostic Concept Representations in Transformers
ACL 2025
Features that Make a Difference: Leveraging Gradients for Improved Dictionary Learning
NAACL 2025
Emergent Wisdom at BEA 2025 Shared Task: From Lexical Understanding to Reflective Reasoning for Pedagogical Ability Assessment
ACL 2025
Unconditional Truthfulness: Learning Unconditional Uncertainty of Large Language Models
EMNLP 2025
CafGa: Customizing Feature Attributions to Explain Language Models
EMNLP 2025
Exploiting Contextual Knowledge in LLMs through 𝒱-usable Information based Layer Enhancement
ACL 2025
𝛿-Stance: A Large-Scale Real World Dataset of Stances in Legal Argumentation
ACL 2025
Emergence of symbolic abstraction heads for in-context learning in large language models
COLING 2025
SocialEval: Evaluating Social Intelligence of Large Language Models
ACL 2025
ETF: An Entity Tracing Framework for Hallucination Detection in Code Summaries
ACL 2025
Weight-based Analysis of Detokenization in Language Models: Understanding the First Stage of Inference Without Inference
NAACL 2025
LLM-Microscope: Uncovering the Hidden Role of Punctuation in Context Memory of Transformers
NAACL 2025
DeepReview: Improving LLM-based Paper Review with Human-like Deep Thinking Process
ACL 2025
Performance Gap in Entity Knowledge Extraction Across Modalities in Vision Language Models
ACL 2025
TimelyMed: AI-Driven Clinical Course Attribution and Temporal Mapping for Psychiatric Medical Records
IJCAI 2025
Know Your Mistakes: Towards Preventing Overreliance on Task-Oriented Conversational AI Through Accountability Modeling
ACL 2025
Representations of Fact, Fiction and Forecast in Large Language Models: Epistemics and Attitudes
ACL 2025
Efficient Rectification of Neuro-Symbolic Reasoning Inconsistencies by Abductive Reflection (Extended Abstract)
IJCAI 2025
Who We Are, Where We Are: Mental Health at the Intersection of Person, Situation, and Large Language Models
NAACL 2025
FoREST: Frame of Reference Evaluation in Spatial Reasoning Tasks
EMNLP 2025
FOCUS: Evaluating Pre-trained Vision-Language Models on Underspecification Reasoning
ACL 2025
Exposing the Achilles’ Heel: Evaluating LLMs Ability to Handle Mistakes in Mathematical Reasoning
ACL 2025