Research Explorer
Interpretability
7318 directly classified papers
Papers per year
2003: 1
2006: 1
2007: 1
2008: 1
2009: 1
2010: 5
2012: 2
2013: 10
2014: 7
2015: 14
2016: 27
2017: 84
2018: 196
2019: 395
2020: 488
2021: 771
2022: 823
2023: 954
2024: 1360
2025: 1713
2026: 464
Papers
Neuron-Level Differentiation of Memorization and Generalization in Large Language Models
EMNLP 2025
Sparse Neurons Carry Strong Signals of Question Ambiguity in LLMs
EMNLP 2025
Promote, Suppress, Iterate: How Language Models Answer One-to-Many Factual Queries
EMNLP 2025
Enhancing Chain-of-Thought Reasoning via Neuron Activation Differential Analysis
EMNLP 2025
When Truthful Representations Flip Under Deceptive Instructions?
EMNLP 2025
Prototypical Human-AI Collaboration Behaviors from LLM-Assisted Writing in the Wild
EMNLP 2025
ThinkEdit: Interpretable Weight Editing to Mitigate Overly Short Thinking in Reasoning Models
EMNLP 2025
PychoAgent: Psychology-driven LLM Agents for Explainable Panic Prediction on Social Media during Sudden Disaster Events
EMNLP 2025
GuessingGame: Measuring the Informativeness of Open-Ended Questions in Large Language Models
EMNLP 2025
V-SEAM: Visual Semantic Editing and Attention Modulating for Causal Interpretability of Vision-Language Models
EMNLP 2025
Temporal Referential Consistency: Do LLMs Favor Sequences Over Absolute Time References?
EMNLP 2025
Mapping the Minds of LLMs: A Graph-Based Analysis of Reasoning LLMs
EMNLP 2025
BANMIME: Misogyny Detection with Metaphor Explanation on Bangla Memes
EMNLP 2025
SQUAB: Evaluating LLM robustness to Ambiguous and Unanswerable Questions in Semantic Parsing
EMNLP 2025
UniDebugger: Hierarchical Multi-Agent Framework for Unified Software Debugging
EMNLP 2025
Understanding the Thinking Process of Reasoning Models: A Perspective from Schoenfeld’s Episode Theory
EMNLP 2025
Towards a Unified Paradigm of Concept Editing in Large Language Models
EMNLP 2025
FacLens: Transferable Probe for Foreseeing Non-Factuality in Fact-Seeking Question Answering of Large Language Models
EMNLP 2025
Group-SAE: Efficient Training of Sparse Autoencoders for Large Language Models via Layer Groups
EMNLP 2025
Identifying Pre-training Data in LLMs: A Neuron Activation-Based Detection Framework
EMNLP 2025
ReSURE: Regularizing Supervision Unreliability for Multi-turn Dialogue Fine-tuning
EMNLP 2025
Artificial Impressions: Evaluating Large Language Model Behavior Through the Lens of Trait Impressions
EMNLP 2025
Beyond the Score: Uncertainty-Calibrated LLMs for Automated Essay Assessment
EMNLP 2025
A Graph-Theoretical Framework for Analyzing the Behavior of Causal Language Models
EMNLP 2025
Developing a Reliable, Fast, General-Purpose Hallucination Detection and Mitigation Service
NAACL 2025