conftrace
Interpretability
7,318 papers
Papers per year
2003: 1
2006: 1
2007: 1
2008: 1
2009: 1
2010: 5
2012: 2
2013: 10
2014: 7
2015: 14
2016: 27
2017: 84
2018: 196
2019: 395
2020: 488
2021: 771
2022: 823
2023: 954
2024: 1360
2025: 1713
2026: 464
Papers
Do Zombies Understand? A Choose-Your-Own-Adventure Exploration of Machine Cognition
ACL 2024
“My Answer is C”: First-Token Probabilities Do Not Match Text Answers in Instruction-Tuned Language Models
ACL 2024
Large Language Models Relearn Removed Concepts
ACL 2024
Machine-Generated Text Localization
ACL 2024
Fact-and-Reflection (FaR) Improves Confidence Calibration of Large Language Models
ACL 2024
Error Analysis Prompting Enables Human-Like Translation Evaluation in Large Language Models
ACL 2024
Rationales for Answers to Simple Math Word Problems Confuse Large Language Models
ACL 2024
EX-FEVER: A Dataset for Multi-hop Explainable Fact Verification
ACL 2024
Fact-Checking the Output of Large Language Models via Token-Level Uncertainty Quantification
ACL 2024
Do Pre-Trained Language Models Detect and Understand Semantic Underspecification? Ask the DUST!
ACL 2024
Understanding and Patching Compositional Reasoning in LLMs
ACL 2024
LLM Factoscope: Uncovering LLMs’ Factual Discernment through Measuring Inner States
ACL 2024
imapScore: Medical Fact Evaluation Made Easy
ACL 2024
Generalization-Enhanced Code Vulnerability Detection via Multi-Task Instruction Fine-Tuning
ACL 2024
DORY: Deliberative Prompt Recovery for LLM
ACL 2024
Data Contamination Calibration for Black-box LLMs
ACL 2024
Truth-Aware Context Selection: Mitigating Hallucinations of Large Language Models Being Misled by Untruthful Contexts
ACL 2024
Peering into the Mind of Language Models: An Approach for Attribution in Contextual Question Answering
ACL 2024
TimeToM: Temporal Space is the Key to Unlocking the Door of Large Language Models’ Theory-of-Mind
ACL 2024
Identifying and Mitigating Annotation Bias in Natural Language Understanding using Causal Mediation Analysis
ACL 2024
Perturbed examples reveal invariances shared by language models
ACL 2024
Investigating the Impact of Model Instability on Explanations and Uncertainty
ACL 2024
Discovering influential text using convolutional neural networks
ACL 2024
X-ACE: Explainable and Multi-factor Audio Captioning Evaluation
ACL 2024
Enhanced Language Model Truthfulness with Learnable Intervention and Uncertainty Expression
ACL 2024