Interpretability

7,318 papers

Papers

Investigating and Mitigating the Multimodal Hallucination Snowballing in Large Vision-Language Models ACL 2024

Don’t Go To Extremes: Revealing the Excessive Sensitivity and Calibration Limitations of LLMs in Implicit Hate Speech Detection ACL 2024

Generating and Evaluating Plausible Explanations for Knowledge Graph Completion ACL 2024

VisDiaHalBench: A Visual Dialogue Benchmark For Diagnosing Hallucination in Large Vision-Language Models ACL 2024

Transferable and Efficient Non-Factual Content Detection via Probe Training with Offline Consistency Checking ACL 2024

What Do Language Models Learn in Context? The Structured Task Hypothesis. ACL 2024

TaPERA: Enhancing Faithfulness and Interpretability in Long-Form Table QA by Content Planning and Execution-based Reasoning ACL 2024

CritiqueLLM: Towards an Informative Critique Generation Model for Evaluation of Large Language Model Generation ACL 2024

Context versus Prior Knowledge in Language Models ACL 2024

Bridging Word-Pair and Token-Level Metaphor Detection with Explainable Domain Mining ACL 2024

Faithful Logical Reasoning via Symbolic Chain-of-Thought ACL 2024

ESCoT: Towards Interpretable Emotional Support Dialogue Systems ACL 2024

When Benchmarks are Targets: Revealing the Sensitivity of Large Language Model Leaderboards ACL 2024

LLM-Rubric: A Multidimensional, Calibrated Approach to Automated Evaluation of Natural Language Texts ACL 2024

NEO-BENCH: Evaluating Robustness of Large Language Models with Neologisms ACL 2024

Skin-in-the-Game: Decision Making via Multi-Stakeholder Alignment in LLMs ACL 2024

Transparent and Scrutable Recommendations Using Natural Language User Profiles ACL 2024

Fora: A corpus and framework for the study of facilitated dialogue ACL 2024

Tracking the Newsworthiness of Public Documents ACL 2024

The Heuristic Core: Understanding Subnetwork Generalization in Pretrained Language Models ACL 2024

CausalGym: Benchmarking causal interpretability methods on linguistic tasks ACL 2024

Mission: Impossible Language Models ACL 2024

PsySafe: A Comprehensive Framework for Psychological-based Attack, Defense, and Evaluation of Multi-agent System Safety ACL 2024

Media Framing: A typology and Survey of Computational Approaches Across Disciplines ACL 2024

Calibrating Large Language Models Using Their Generations Only ACL 2024