conftrace
Interpretability
7,318 papers
Papers per year
2003: 1
2006: 1
2007: 1
2008: 1
2009: 1
2010: 5
2012: 2
2013: 10
2014: 7
2015: 14
2016: 27
2017: 84
2018: 196
2019: 395
2020: 488
2021: 771
2022: 823
2023: 954
2024: 1360
2025: 1713
2026: 464
Papers
MedThink: A Rationale-Guided Framework for Explaining Medical Visual Question Answering
NAACL 2025
Features that Make a Difference: Leveraging Gradients for Improved Dictionary Learning
NAACL 2025
LLM-Microscope: Uncovering the Hidden Role of Punctuation in Context Memory of Transformers
NAACL 2025
Gradient-guided Attention Map Editing: Towards Efficient Contextual Hallucination Mitigation
NAACL 2025
Explainability for NLP in Pharmacovigilance: A Study on Adverse Event Report Triage in Swedish
NAACL 2025
Explainable ICD Coding via Entity Linking
NAACL 2025
Who We Are, Where We Are: Mental Health at the Intersection of Person, Situation, and Large Language Models
NAACL 2025
Bridging the Faithfulness Gap in Prototypical Models
NAACL 2025
Error Reflection Prompting: Can Large Language Models Successfully Understand Errors?
NAACL 2025
Interpretable Models for Detecting Linguistic Variation in Russian Media: Towards Unveiling Propagandistic Strategies during the Russo-Ukrainian War
NAACL 2025
Probing Internal Representations of Multi-Word Verbs in Large Language Models
NAACL 2025
The AI Co-Ethnographer: How Far Can Automation Take Qualitative Research?
NAACL 2025
VLG-BERT: Towards Better Interpretability in LLMs through Visual and Linguistic Grounding
NAACL 2025
Ambiguity Detection and Uncertainty Calibration for Question Answering with Large Language Models
NAACL 2025
Smaller Large Language Models Can Do Moral Self-Correction
NAACL 2025
Error Detection for Multimodal Classification
NAACL 2025
Know What You do Not Know: Verbalized Uncertainty Estimation Robustness on Corrupted Images in Vision-Language Models
NAACL 2025
Bias A-head? Analyzing Bias in Transformer-Based Language Model Attention Heads
NAACL 2025
A Calibrated Reflection Approach for Enhancing Confidence Estimation in LLMs
NAACL 2025
Evaluating Design Choices in Verifiable Generation with Open-source Models
NAACL 2025
Disentangling Linguistic Features with Dimension-Wise Analysis of Vector Embeddings
NAACL 2025
Investigating and Addressing Hallucinations of LLMs in Tasks Involving Negation
NAACL 2025
Holmes: Localizing Irregularities in LLM Training with Mega-scale GPU Clusters
NSDI 2025
Learning Interpretable Features from Interventions
RSS 2025
REFIND at SemEval-2025 Task 3: Retrieval-Augmented Factuality Hallucination Detection in Large Language Models
SEMEVAL 2025