Research Explorer
Interpretability
7318 directly classified papers
Papers per year
2003: 1
2006: 1
2007: 1
2008: 1
2009: 1
2010: 5
2012: 2
2013: 10
2014: 7
2015: 14
2016: 27
2017: 84
2018: 196
2019: 395
2020: 488
2021: 771
2022: 823
2023: 954
2024: 1360
2025: 1713
2026: 464
Papers
Understanding Subword Compositionality of Large Language Models
EMNLP 2025
WildDoc: How Far Are We from Achieving Comprehensive and Robust Document Understanding in the Wild?
EMNLP 2025
Cross-Refine: Improving Natural Language Explanation Generation by Learning in Tandem
COLING 2025
ThoughtProbe: Classifier-Guided LLM Thought Space Exploration via Probing Representations
EMNLP 2025
Rethinking Backdoor Detection Evaluation for Language Models
EMNLP 2025
Exploring Concept Depth: How Large Language Models Acquire Knowledge and Concept at Different Layers?
COLING 2025
Follow the Flow: Fine-grained Flowchart Attribution with Neurosymbolic Agents
EMNLP 2025
Drivel-ology: Challenging LLMs with Interpreting Nonsense with Depth
EMNLP 2025
Language Models Encode the Value of Numbers Linearly
COLING 2025
CAST: Cross-modal Alignment Similarity Test for Vision Language Models
COLING 2025
CLEV: LLM-Based Evaluation Through Lightweight Efficient Voting for Free-Form Question-Answering
IJCNLP 2025
Unveiling the Influence of Amplifying Language-Specific Neurons
IJCNLP 2025
Isolating Culture Neurons in Multilingual Large Language Models
IJCNLP 2025
Learning from Hallucinations: Mitigating Hallucinations in LLMs via Internal Representation Intervention
IJCNLP 2025
Moral Self-correction is Not An Innate Capability in Language Models
IJCNLP 2025
Surprisal Dynamics for the Detection of Multi-Word Expressions in English
IJCNLP 2025
Structured Outputs in Prompt Engineering: Enhancing LLM Adaptability on Counterintuitive Instructions
IJCNLP 2025
Improving Explainable Fact-Checking with Claim-Evidence Correlations
COLING 2025
Multilingual Political Views of Large Language Models: Identification and Steering
IJCNLP 2025
Tree-of-Quote Prompting Improves Factuality and Attribution in Multi-Hop and Medical Reasoning
EMNLP 2025
AdaSteer: Your Aligned LLM is Inherently an Adaptive Jailbreak Defender
EMNLP 2025
Recoverable Anonymization for Pose Estimation: A Privacy-Enhancing Approach
WACV 2025
HalluCounter: Reference-free LLM Hallucination Detection in the Wild!
IJCNLP 2025
Modular Arithmetic: Language Models Solve Math Digit by Digit
IJCNLP 2025
To Generate or Discriminate? Methodological Considerations for Measuring Cultural Alignment in LLMs
IJCNLP 2025