conftrace_

Interpretability

7,318 papers

Papers

Transformer Attention vs Human Attention in Anaphora Resolution ACL 2024

Learning to Trust Your Feelings: Leveraging Self-awareness in LLMs for Hallucination Mitigation ACL 2024

Safe-Embed: Unveiling the Safety-Critical Knowledge of Sentence Encoders ACL 2024

Measuring the Inconsistency of Large Language Models in Preferential Ranking ACL 2024

Unlocking the Potential of Large Language Models for Clinical Text Anonymization: A Comparative Study ACL 2024

Word Boundary Information Isn’t Useful for Encoder Language Models ACL 2024

Tracking linguistic information in transformer-based sentence embeddings through targeted sparsification ACL 2024

Evaluating the Effectiveness of Retrieval-Augmented Large Language Models in Scientific Document Reasoning ACL 2024

AffilGood: Building reliable institution name disambiguation tools to improve scientific literature analysis ACL 2024

Automatic Quote Attribution in Chinese Literary Works ACL 2024

Deloitte at #SMM4H 2024: Can GPT-4 Detect COVID-19 Tweets Annotated by Itself? ACL 2024

Occam’s Razor and Bender and Koller’s Octopus ACL 2024

Towards Understanding Attention-based Reasoning through Graph Structures in Medical Codes Classification ACL 2024

Leveraging Graph Structures to Detect Hallucinations in Large Language Models ACL 2024

Zhenmei at WASSA-2024 Empathy and Personality Shared Track 2 Incorporating Pearson Correlation Coefficient as a Regularization Term for Enhanced Empathy and Emotion Prediction in Conversational Turns ACL 2024

Boundary-Aware Uncertainty for Feature Attribution Explainers AISTATS 2024

Density Uncertainty Layers for Reliable Uncertainty Estimation AISTATS 2024

Looping in the Human: Collaborative and Explainable Bayesian Optimization AISTATS 2024

Quantifying Uncertainty in Natural Language Explanations of Large Language Models AISTATS 2024

Analyzing Explainer Robustness via Probabilistic Lipschitzness of Prediction Functions AISTATS 2024

Neural Additive Models for Location Scale and Shape: A Framework for Interpretable Neural Regression Beyond the Mean AISTATS 2024

Interpretability Guarantees with Merlin-Arthur Classifiers AISTATS 2024

Tackling the XAI Disagreement Problem with Regional Explanations AISTATS 2024

Sparse and Faithful Explanations Without Sparse Models AISTATS 2024

Unveiling Latent Causal Rules: A Temporal Point Process Approach for Abnormal Event Explanation AISTATS 2024