Research Explorer
Interpretability
7318 directly classified papers
Papers per year
2003: 1
2006: 1
2007: 1
2008: 1
2009: 1
2010: 5
2012: 2
2013: 10
2014: 7
2015: 14
2016: 27
2017: 84
2018: 196
2019: 395
2020: 488
2021: 771
2022: 823
2023: 954
2024: 1360
2025: 1713
2026: 464
Papers
Deconstructing Instruction-Following: A New Benchmark for Granular Evaluation of Large Language Model Instruction Compliance Abilities
EACL 2026
Say It Another Way: Auditing LLMs with a User-Grounded Automated Paraphrasing Framework
EACL 2026
How Reliable are Confidence Estimators for Large Reasoning Models? A Systematic Benchmark on High-Stakes Domains
EACL 2026
SearchLLM: Detecting LLM Paraphrased Text by Measuring the Similarity with Regeneration of the Candidate Source via Search Engine
EACL 2026
Mind the Gap: Benchmarking LLM Uncertainty and Calibration with Specialty-Aware Clinical QA and Reasoning-Based Behavioural Features
EACL 2026
Can Activation Steering Generalize Across Languages? A Study on Syllogistic Reasoning in Language Models
EACL 2026
Safe-Unsafe Concept Separation Emerges from a Single Direction in Language Models Activation Space
EACL 2026
When Meanings Meet: Investigating the Emergence and Quality of Shared Concept Spaces during Multilingual Language Model Training
EACL 2026
A Unified View on Emotion Representation in Large Language Models
EACL 2026
TRACE: A Framework for Analyzing and Enhancing Stepwise Reasoning in Vision-Language Models
EACL 2026
Spotlight Your Instructions: Instruction-following with Dynamic Attention Steering
EACL 2026
FaithLM: Towards Faithful Explanations for Large Language Models
EACL 2026
Journey Before Destination: On the importance of Visual Faithfulness in Slow Thinking
EACL 2026
HateXScore: A Metric Suite for Evaluating Reasoning Quality in Hate Speech Explanations
EACL 2026
Attribution-Guided Multi-Object Hallucination and Bias Detection in Vision-Language Models
EACL 2026
Word Surprisal Correlates with Sentential Contradiction in LLMs
EACL 2026
Knowing the Facts but Choosing the Shortcut: Understanding How Large Language Models Compare Entities
EACL 2026
Recursive numeral systems are highly regular and easy to process
EACL 2026
MEVER: Multi-Modal and Explainable Claim Verification with Graph-based Evidence Retrieval
EACL 2026
DuwatBench: Bridging Language and Visual Heritage through an Arabic Calligraphy Benchmark for Multimodal Understanding
EACL 2026
Beyond Math: Stories as a Testbed for Memorization-Constrained Reasoning in LLMs
EACL 2026
Neural Breadcrumbs: Membership Inference Attacks on LLMs Through Hidden State and Attention Pattern Analysis
EACL 2026
Steering Safely or Off a Cliff? Rethinking Specificity and Robustness in Inference-Time Interventions
EACL 2026
Now You Hear Me: Audio Narrative Attacks Against Large Audio–Language Models
EACL 2026
Feature Drift: How Fine-Tuning Repurposes Representations in LLMs
EACL 2026