Interpretability

7318 directly classified papers

Papers

Explainable Depression Detection in Clinical Interviews with Personalized Retrieval-Augmented Generation ACL 2025

CodeHalu: Investigating Code Hallucinations in LLMs via Execution-based Verification AAAI 2025

A Tale of Evaluating Factual Consistency: Case Study on Long Document Summarization Evaluation ACL 2025

Derivative-Free Diffusion Manifold-Constrained Gradient for Unified XAI CVPR 2025

Towards Explainable Hate Speech Detection ACL 2025

Interpreting Multi-Attribute Confounding through Numerical Attributes in Large Language Models AACL 2025

Probing Subphonemes in Morphology Models ACL 2025

ClinStructor: AI-Powered Structuring of Unstructured Clinical Texts AACL 2025

FADE: Why Bad Descriptions Happen to Good Features ACL 2025

EIFBENCH: Extremely Complex Instruction Following Benchmark for Large Language Models EMNLP 2025

Quasi-symbolic Semantic Geometry over Transformer-based Variational AutoEncoder ACL 2025

Is There No Such Thing as a Bad Question? H4R: HalluciBot for Ratiocination, Rewriting, Ranking, and Routing AAAI 2025

Investigating Psychometric Predictive Power of Syntactic Attention ACL 2025

Attention-guided Self-reflection for Zero-shot Hallucination Detection in Large Language Models EMNLP 2025

RUC Team at SemEval-2025 Task 5: Fast Automated Subject Indexing: A Method Based on Similar Records Matching and Related Subject Ranking ACL 2025

HalluRAG-RUG at SemEval-2025 Task 3: Using Retrieval-Augmented Generation for Hallucination Detection in Model Outputs SEMEVAL 2025

Extended Abstract: Probing-Guided Parameter-Efficient Fine-Tuning for Balancing Linguistic Adaptation and Safety in LLM-based Social Influence Systems ACL 2025

DSVD: Dynamic Self-Verify Decoding for Faithful Generation in Large Language Models EMNLP 2025

MemeQA: Holistic Evaluation for Meme Understanding ACL 2025

On the Convergence of Moral Self-Correction in Large Language Models IJCNLP 2025

HumT DumT: Measuring and controlling human-like language in LLMs ACL 2025

Beyond the Score: Uncertainty-Calibrated LLMs for Automated Essay Assessment EMNLP 2025

Stochastic Chameleons: Irrelevant Context Hallucinations Reveal Class-Based (Mis)Generalization in LLMs ACL 2025

A Graph-Theoretical Framework for Analyzing the Behavior of Causal Language Models EMNLP 2025

The Impact of Negated Text on Hallucination with Large Language Models EMNLP 2025