Interpretability

7,318 papers

Papers

Do Zombies Understand? A Choose-Your-Own-Adventure Exploration of Machine Cognition ACL 2024

“My Answer is C”: First-Token Probabilities Do Not Match Text Answers in Instruction-Tuned Language Models ACL 2024

Large Language Models Relearn Removed Concepts ACL 2024

Machine-Generated Text Localization ACL 2024

Fact-and-Reflection (FaR) Improves Confidence Calibration of Large Language Models ACL 2024

Error Analysis Prompting Enables Human-Like Translation Evaluation in Large Language Models ACL 2024

Rationales for Answers to Simple Math Word Problems Confuse Large Language Models ACL 2024

EX-FEVER: A Dataset for Multi-hop Explainable Fact Verification ACL 2024

Fact-Checking the Output of Large Language Models via Token-Level Uncertainty Quantification ACL 2024

Do Pre-Trained Language Models Detect and Understand Semantic Underspecification? Ask the DUST! ACL 2024

Understanding and Patching Compositional Reasoning in LLMs ACL 2024

LLM Factoscope: Uncovering LLMs’ Factual Discernment through Measuring Inner States ACL 2024

imapScore: Medical Fact Evaluation Made Easy ACL 2024

Generalization-Enhanced Code Vulnerability Detection via Multi-Task Instruction Fine-Tuning ACL 2024

DORY: Deliberative Prompt Recovery for LLM ACL 2024

Data Contamination Calibration for Black-box LLMs ACL 2024

Truth-Aware Context Selection: Mitigating Hallucinations of Large Language Models Being Misled by Untruthful Contexts ACL 2024

Peering into the Mind of Language Models: An Approach for Attribution in Contextual Question Answering ACL 2024

TimeToM: Temporal Space is the Key to Unlocking the Door of Large Language Models’ Theory-of-Mind ACL 2024

Identifying and Mitigating Annotation Bias in Natural Language Understanding using Causal Mediation Analysis ACL 2024

Perturbed examples reveal invariances shared by language models ACL 2024

Investigating the Impact of Model Instability on Explanations and Uncertainty ACL 2024

Discovering influential text using convolutional neural networks ACL 2024

X-ACE: Explainable and Multi-factor Audio Captioning Evaluation ACL 2024

Enhanced Language Model Truthfulness with Learnable Intervention and Uncertainty Expression ACL 2024