Research Explorer
Papers
Trends
Conferences
Explore
Authors
Topics
Keywords
Achievements
About
Methodology
Interpretability
7318 directly classified papers
Papers per year
2003: 1
2006: 1
2007: 1
2008: 1
2009: 1
2010: 5
2012: 2
2013: 10
2014: 7
2015: 14
2016: 27
2017: 84
2018: 196
2019: 395
2020: 488
2021: 771
2022: 823
2023: 954
2024: 1360
2025: 1713
2026: 464
Papers
DEL-ToM: Inference-Time Scaling for Theory-of-Mind Reasoning via Dynamic Epistemic Logic
EMNLP 2025
Should I Share this Translation? Evaluating Quality Feedback for User Reliance on Machine Translation
EMNLP 2025
Ask-Before-Detection: Identifying and Mitigating Conformity Bias in LLM-Powered Error Detector for Math Word Problem Solutions
ACL 2025
Polysemantic Dropout: Conformal OOD Detection for Specialized LLMs
EMNLP 2025
Disentangling Memory and Reasoning Ability in Large Language Models
ACL 2025
A Practical Method for Generating String Counterfactuals
NAACL 2025
Improve Decoding Factuality by Token-wise Cross Layer Entropy of Large Language Models
NAACL 2025
Comprehensive Layer-wise Analysis of SSL Models for Audio Deepfake Detection
NAACL 2025
Attention on Multiword Expressions: A Multilingual Study of BERT-based Models with Regard to Idiomaticity and Microsyntax
NAACL 2025
From Text to Emoji: How PEFT-Driven Personality Manipulation Unleashes the Emoji Potential in LLMs
NAACL 2025
Induction Heads as an Essential Mechanism for Pattern Matching in In-context Learning
NAACL 2025
On the Feasibility of In-Context Probing for Data Attribution
NAACL 2025
Neuro-Symbolic Integration Brings Causal and Reliable Reasoning Proofs
NAACL 2025
Weight-based Analysis of Detokenization in Language Models: Understanding the First Stage of Inference Without Inference
NAACL 2025
MedThink: A Rationale-Guided Framework for Explaining Medical Visual Question Answering
NAACL 2025
Features that Make a Difference: Leveraging Gradients for Improved Dictionary Learning
NAACL 2025
Language Model Meets Prototypes: Towards Interpretable Text Classification Models through Prototypical Networks
AAAI 2025
Explainable ICD Coding via Entity Linking
NAACL 2025
Who We Are, Where We Are: Mental Health at the Intersection of Person, Situation, and Large Language Models
NAACL 2025
Bridging the Faithfulness Gap in Prototypical Models
NAACL 2025
Error Reflection Prompting: Can Large Language Models Successfully Understand Errors?
NAACL 2025
Probing Internal Representations of Multi-Word Verbs in Large Language Models
NAACL 2025
The AI Co-Ethnographer: How Far Can Automation Take Qualitative Research?
NAACL 2025
VLG-BERT: Towards Better Interpretability in LLMs through Visual and Linguistic Grounding
NAACL 2025
Learning About Algorithm Auditing in Five Steps: Scaffolding How High School Youth Can Systematically and Critically Evaluate Machine Learning Applications
AAAI 2025