Research Explorer
Interpretability
7318 directly classified papers
Papers per year
2003: 1
2006: 1
2007: 1
2008: 1
2009: 1
2010: 5
2012: 2
2013: 10
2014: 7
2015: 14
2016: 27
2017: 84
2018: 196
2019: 395
2020: 488
2021: 771
2022: 823
2023: 954
2024: 1360
2025: 1713
2026: 464
Papers
Steering Safely or Off a Cliff? Rethinking Specificity and Robustness in Inference-Time Interventions
EACL 2026
Now You Hear Me: Audio Narrative Attacks Against Large Audio–Language Models
EACL 2026
How Do LLMs Generate Contrastive Sentiments? A Mechanistic Perspective
EACL 2026
Evidential Semantic Entropy for LLM Uncertainty Quantification
EACL 2026
Tandem Training for Language Models
EACL 2026
Debate, Deliberate, Decide (D3): A Cost-Aware Adversarial Framework for Reliable and Interpretable LLM Evaluation
EACL 2026
Out of Distribution, Out of Luck: Process Rewards Misguide Reasoning Models
EACL 2026
Funny or Persuasive, but Not Both: Evaluating Fine-Grained Multi-Concept Control in LLMs
EACL 2026
CHiRPE: A Step Towards Real-World Clinical NLP with Clinician-Oriented Model Explanations
EACL 2026
LLMs Know More About Numbers than They Can Say
EACL 2026
Simplifying Outcomes of Language Model Component Analyses with ELIA
EACL 2026
Similar, but why? A Toolkit for Explaining Text Similarity
EACL 2026
RAGVUE: A Diagnostic View for Explainable and Automated Evaluation of Retrieval-Augmented Generation
EACL 2026
Thesis proposal: COGNILENS: Analyzing Cognitive Decline in Language Models for Alzheimer’s Monitoring
EACL 2026
From Sentences to Proof Trees: Leveraging Language Models for Structured Reasoning
EACL 2026
SAGE: An Agentic Explainer Framework for Interpreting SAE Features in Language Models
EACL 2026
Benchmarking and Mitigating the Impact of Noisy User Prompts in Medical VLMs via Cross-Modal Reflection
EACL 2026
Cognitive Effects and Biases in Large Language Models
EACL 2026
Don’t Judge Code by Its Cover: Exploring Biases in LLM Judges for Code Evaluation
EACL 2026
Bias in the Ear of the Listener: Assessing Sensitivity in Audio Language Models Across Linguistic, Demographic, and Positional Variations
EACL 2026
Detection of Adversarial Prompts with Model Predictive Entropy
EACL 2026
Unveiling Decision-Making in LLMs for Text Classification : Extraction of influential and interpretable concepts with Sparse Autoencoders
EACL 2026
Interpretable Graph-Language Modeling for Detecting Youth Illicit Drug Use
EACL 2026
Beyond Multiple Choice: Evaluating Steering Vectors for Summarization
EACL 2026
How Does Chain of Thought Think? Mechanistic Interpretability of Chain-of-Thought Reasoning with Sparse Autoencoding
AAAI 2026