Interpretability

7318 directly classified papers

Papers

DecepBench: Benchmarking Multimodal Deception Detection ACL 2025

CLAIM: An Intent-Driven Multi-Agent Framework for Analyzing Manipulation in Courtroom Dialogues ACL 2025

YNU-HPCC at SemEval-2025 Task3: Leveraging Zero-Shot Learning for Hallucination Detection ACL 2025

DUTJBD at SemEval-2025 Task 3: A Range of Approaches for Predicting Hallucination Generation in Models ACL 2025

Howard University-AI4PC at SemEval-2025 Task 10: Ensembling LLMs for Multi-lingual Multi-Label and Multi-Class Meta-Classification ACL 2025

TartanTritons at SemEval-2025 Task 10: Multilingual Hierarchical Entity Classification and Narrative Reasoning using Instruct-Tuned LLMs ACL 2025

Understanding Verbatim Memorization in LLMs Through Circuit Discovery ACL 2025

VerbaNexAI at SemEval-2025 Task 3: Fact Retrieval with Google Snippets for LLM Context Filtering to identify Hallucinations ACL 2025

Tracing and Dissecting How LLMs Recall Factual Knowledge for Real World Questions ACL 2025

Beyond the Answer: Advancing Multi-Hop QA with Fine-Grained Graph Reasoning and Evaluation ACL 2025

Rethinking Backdoor Detection Evaluation for Language Models EMNLP 2025

Towards Fully Exploiting LLM Internal States to Enhance Knowledge Boundary Perception ACL 2025

A-I-RAVEN and I-RAVEN-Mesh: Two New Benchmarks for Abstract Visual Reasoning IJCAI 2025

LaTIM: Measuring Latent Token-to-Token Interactions in Mamba Models ACL 2025

ThoughtProbe: Classifier-Guided LLM Thought Space Exploration via Probing Representations EMNLP 2025

Math Neurosurgery: Isolating Language Models’ Math Reasoning Abilities Using Only Forward Passes ACL 2025

Curriculum Abductive Learning for Mitigating Reasoning Shortcuts IJCAI 2025

PCoT: Persuasion-Augmented Chain of Thought for Detecting Fake News and Social Media Disinformation ACL 2025

LingConv: An Interactive Toolkit for Controlled Paraphrase Generation with Linguistic Attribute Control EMNLP 2025

Estimating Privacy Leakage of Augmented Contextual Knowledge in Language Models ACL 2025

Run Like a Neural Network, Explain Like k-Nearest Neighbor IJCAI 2025

Shaping the Safety Boundaries: Understanding and Defending Against Jailbreaks in Large Language Models ACL 2025

Rule-Guided Graph Neural Networks for Explainable Knowledge Graph Reasoning AAAI 2025

Do Language Models Have Semantics? On the Five Standard Positions ACL 2025

Things Machine Learning Models Know That They Don’t Know AAAI 2025