Research Explorer
Interpretability
7318 directly classified papers
Papers per year
2003: 1
2006: 1
2007: 1
2008: 1
2009: 1
2010: 5
2012: 2
2013: 10
2014: 7
2015: 14
2016: 27
2017: 84
2018: 196
2019: 395
2020: 488
2021: 771
2022: 823
2023: 954
2024: 1360
2025: 1713
2026: 464
Papers
Mapping from Meaning: Addressing the Miscalibration of Prompt-Sensitive Language Models
AAAI 2025
SkillVerse: Assessing and Enhancing LLMs with Tree Evaluation
ACL 2025
Exploiting the Shadows: Unveiling Privacy Leaks through Lower-Ranked Tokens in Large Language Models
ACL 2025
Enhancing Uncertainty Modeling with Semantic Graph for Hallucination Detection
AAAI 2025
Cross-Lingual Pitfalls: Automatic Probing Cross-Lingual Weakness of Multilingual Large Language Models
ACL 2025
LLM Braces: Straightening Out LLM Predictions with Relevant Sub-Updates
ACL 2025
Imitate Before Detect: Aligning Machine Stylistic Preference for Machine-Revised Text Detection
AAAI 2025
Identifying Query-Relevant Neurons in Large Language Models for Long-Form Texts
AAAI 2025
Nuance Matters: Probing Epistemic Consistency in Causal Reasoning
AAAI 2025
CodeHalu: Investigating Code Hallucinations in LLMs via Execution-based Verification
AAAI 2025
CER: Confidence Enhanced Reasoning in LLMs
ACL 2025
GAMEBoT: Transparent Assessment of LLM Reasoning in Games
ACL 2025
Extracting Interpretable Task-Specific Circuits from Large Language Models for Faster Inference
AAAI 2025
Beyond Facts: Evaluating Intent Hallucination in Large Language Models
ACL 2025
Unlocking General Long Chain-of-Thought Reasoning Capabilities of Large Language Models via Representation Engineering
ACL 2025
FLUE: Streamlined Uncertainty Estimation for Large Language Models
AAAI 2025
Calibrating Large Language Models with Sample Consistency
AAAI 2025
Adaptive Retrieval Without Self-Knowledge? Bringing Uncertainty Back Home
ACL 2025
Are the Hidden States Hiding Something? Testing the Limits of Factuality-Encoding Capabilities in LLMs
ACL 2025
Eliciting Causal Abilities in Large Language Models for Reasoning Tasks
AAAI 2025
Enhancing Automated Interpretability with Output-Centric Feature Descriptions
ACL 2025
ConSim: Measuring Concept-Based Explanations’ Effectiveness with Automated Simulatability
ACL 2025
Tagged Span Annotation for Detecting Translation Errors in Reasoning LLMs
EMNLP 2025
Towards Unifying Evaluation of Counterfactual Explanations: Leveraging Large Language Models for Human-Centric Assessments
AAAI 2025
Benchmarking and Understanding Compositional Relational Reasoning of LLMs
AAAI 2025